The main results obtained and published during the period covered by this report, August 1988
- July 1989, are described below, together with references to the corresponding
publications.

1.  The  Interacting  Multiple  Model  Algorithm  for  Systems  with  Markovian  Switching  Coefficients, 
(Henk  A.  Blom  and  Yaakov  Bar-Shalom,  IEEE  Transactions  on  Automatic  Control  Vol.  33, 
No.  8,  August  1988) 

An  important  problem  in  filtering  for  linear  systems  with  Markovian  switching  coefficients 
(dynamic  multiple  model  systems)  is  the  one  of  management  of  hypotheses,  which  is  necessary 
to  limit  the  computational  requirements.  A  novel  approach  to  hypotheses  merging  has  been 
developed for this problem. The novelty lies in the timing of hypotheses merging. When
applied to the problem of filtering for a linear system with Markovian coefficients, this yields an
elegant way to derive the interacting multiple model (IMM) algorithm. Evaluation of the IMM
algorithm makes it clear that it performs very well at a relatively low computational load. These
results imply a significant change in the state of the art of approximate Bayesian filtering for
systems with Markovian coefficients.


2.  Failure  Detection  Via  Recursive  Estimation  for  a  Class  of  Semi-Markov  Switching  Systems, 
(L. Campo, P. Mookerjee and Y. Bar-Shalom, Proceedings 1988 IEEE CDC, Austin, Texas)

An  area  of  current  interest  is  the  estimation  of  the  state  of  discrete-time  stochastic  systems  with 
parameters  which  may  switch  among  a  finite  set  of  values.  The  parameter  switching  process  of 
interest is modeled by a class of semi-Markov chains. This class of processes is useful in that
it pertains to many areas of interest, such as the failure detection problem, the target tracking
problem, socio-economic problems, and the problem of approximating nonlinear systems by
a set of linearized models. It is shown in this paper how the transition probabilities, which
govern the model switching at each time step, can be inferred via the evaluation of the
conditional distribution of the sojourn time. Following this, a recursive state estimation
algorithm for dynamic systems with noisy observations and changing structures, which uses
the conditional sojourn time distribution, is derived and applied to a failure detection
problem.
3. Distributed Adaptive Estimation with Probabilistic Data Association, (K.C. Chang and Y.
Bar-Shalom, Automatica, Vol. 25, No. 3, pp. 359-369, 1989)

The  probabilistic  data  association  filter  (PDAF)  estimates  the  state  of  a  target  in  a  cluttered 
environment.  This  suboptimal  Bayesian  approach  assumes  that  the  exact  target  and 
measurement  models  are  known.  However,  in  most  practical  applications,  there  are  difficulties 
in  obtaining  an  exact  mathematical  model  of  the  physical  process.  In  this  paper,  the  problem  of 
estimating  target  states  with  uncertain  measurement  origins  and  uncertain  system  models  in  a 
distributed manner is considered. First, a scheme is described for local processing; then the
fusion algorithm, which combines the locally processed results into a global one, is derived. The
algorithm can be applied for tracking a maneuvering target in a cluttered and low-detection
environment with a distributed sensor network.


4.  An  Adaptive  Dual  Controller  for  a  MIMO-ARMA  System,  (P.  Mookerjee  and  Y.  Bar-Shalom, 
IEEE  Transactions  on  Automatic  Control.  Vol.  34,  No.  7,  July  1989) 


An  explicit  adaptive  dual  controller  has  been  derived  for  a  multiinput  multioutput  ARMA 
system.  The  plant  has  constant  but  unknown  parameters.  The  cautious  controller  with  a 
one-step  horizon  and  a  new  dual  controller  with  a  two-step  horizon  are  examined.  In  many 
instances, the myopic cautious controller is seen to turn off and to converge very slowly. The
dual controller modifies the cautious control design by numerator and denominator correction
terms which depend upon the sensitivity functions of the expected future cost, and it avoids the
turn-off and slow convergence. Monte Carlo comparisons based on parametric and
nonparametric statistical analysis indicate the superiority of the dual controller over the cautious
controller.
5. Time-Reversion of a Hybrid State Stochastic Difference System, (Henk A.P. Blom and
Yaakov Bar-Shalom, Proc. 1989 IEEE Int'l. Conf. on Control & Applications, Jerusalem,
Israel, April 1989; to appear in IEEE Trans. Info. Theory, 1990)

This  paper  develops  the  reversion  in  time  of  a  stochastic  difference  equation  in  a  hybrid  space, 
with  a  Markovian  solution.  The  reversion  is  obtained  by  a  martingale  approach,  which 
previously led to reverse time forms for stochastic equations with Gauss-Markov or diffusion
solutions.  The  reverse  time  equations  follow  from  a  particular  non-canonical  martingale 
decomposition,  while  the  reverse  time  equations  for  Gauss-Markov  and  diffusion  solutions 
followed  from  the  canonical martingale  decomposition.  The  need  for  the  non-canonical 
decomposition  stems  from  the  hybrid  state  space  situation.  The  non-Gaussian  discrete  time 
situation  leads  to  reverse  time  equations  that  incorporate  a  Bayesian  estimation  step. 

6. A New Controller for Discrete-Time Stochastic Systems with Markovian Jump Parameters, (L.
Campo and Y. Bar-Shalom, 11th IFAC World Congress, Tallinn, USSR, Aug. 1990)

A  realistic  stochastic  control  problem  for  hybrid  systems  with  Markovian  jump  parameters  may 
have  the  switching  parameters  in  both  the  state  and  measurement  equations.  Furthermore,  both 
the  system  state  and  the  jump  states  may  not  be  perfectly  observed.  Prior  to  this  work  the  only 
existing  implementable  controller  for  this  problem  was  based  upon  a  heuristic  multiple  model 
partitioning  (MMP)  and  hypothesis  pruning.  In  this  paper  a  stochastic  control  algorithm  for 
stochastic  systems  with  Markovian  jump  parameters  was  developed.  The  control  algorithm  is 
derived through the use of stochastic dynamic programming and is designed to be used for
realistic stochastic control problems, i.e., with noisy state observations. The state estimation
and  model  identification  is  done  via  the  recently  developed  Interacting  Multiple  Model 
algorithm.  Simulation  results  show  that  a  substantial  reduction  in  cost  can  be  obtained  by  this 
new  control  algorithm  over  the  MMP  scheme. 
7. From Piecewise Deterministic To Piecewise Diffusion Markov Processes, (Henk A.P. Blom,
Proc. IEEE CDC 1988)

Piecewise  Deterministic  (PD)  Markov  processes  form  a  remarkable  class  of  hybrid  state 
processes  because,  in  contrast  to  most  other  hybrid  state  processes,  they  include  a  jump 
reflecting  boundary  and  exclude  diffusion.  As  such,  they  cover  a  wide  variety  of  impulsively 
or  singularly  controlled  non-diffusion  processes.  Because  PD  processes  are  defined  in  a 
pathwise way, they provide a framework to study the control of non-diffusion processes along
the same lines as that of diffusions. An important generalization is to include diffusion in PD
processes,  but,  as  pointed  out  by  Davis,  combining  diffusion  with  a  jump  reflecting  boundary 
seems  not  possible  within  the  present  definition  of  PD  processes.  This  paper  presents  PD 
processes  as  pathwise  unique  solutions  of  an  Ito  stochastic  differential  equation  (SDE),  driven 
by  a  Poisson  random  measure.  Since  such  an  SDE  permits  the  inclusion  of  diffusion,  this 
approach  leads  to  a  large  variety  of  piecewise  diffusion  Markov  processes,  represented  by 
pathwise  unique  SDE  solutions. 

8.  Control  of  Discrete-Time  Hybrid  Stochastic  Systems  (L.  Campo  and  Y.  Bar-Shalom,  to  appear 
in  Proc.  1990  ACC,  under  review  for  IEEE  T-AC). 

A  realistic  stochastic  control  problem  for  hybrid  systems  with  Markovian  jump  parameters  can 
have  the  switching  parameters  in  both  the  state  and  measurement  equations.  Furthermore,  both 
the  system  state  and  the  jump  states  are,  in  general,  not  perfectly  observed.  Currently  there  are 
only  two  existing  controllers  for  this  problem.  One  is  based  upon  a  heuristic  multiple  model 
partitioning  (MMP)  and  hypothesis  pruning.  The  other  utilizes  the  entire  future  tree  of  models, 
and  is  called  the  Full-Tree  (FT)  controller.  The  performance  of  the  latter  is  superior  to  the 
former  and  their  complexities  are  similar.  In  this  paper  we  present  a  new  stochastic  control 
algorithm  for  stochastic  systems  with  Markovian  jump  parameters.  This  control  algorithm  is 
derived  through  the  use  of  stochastic  dynamic  programming  and  is  designed  to  be  used  for 
realistic  stochastic  control  problems,  i.e.,  with  noisy  state  observations.  This  new  scheme, 
which  is  based  upon  the  interaction  of  r  (the  number  of  models)  model-conditioned  Riccati 
equations,  has  a  natural  parallelism  and  is  straightforward  to  implement.  The  state  estimation 
and  model  identification  is  done  via  the  recently  developed  Interacting  Multiple  Model 
algorithm.  Simulation  results  show  that  a  substantial  reduction  in  cost  can  be  obtained  by  this 
new  control  algorithm  over  the  MMP  scheme.  Furthermore,  the  performance  of  the  new 
algorithm  is  shown  to  be  practically  the  same  as  that  of  the  FT  scheme  even  though  the  new 
scheme,  which  has  a  fixed  amount  of  computations  at  each  step  of  the  recursion,  is  much 
simpler  to  implement  than  both  the  MMP  and  FT  algorithms. 
9. Discrete Time Point Process Filter for Image Based Target Mode Estimation (C. Yang and Y.
Bar-Shalom, to be submitted to 1990 IEEE CDC).

The  performance  of  tracking  and  prediction  systems  of  a  maneuvering  target  can  be  improved 
by  using  additional  (and  unconventional)  measurements  of  its  apparent  modes,  typically 
provided  by  an  imaging  sensor.  A  model  for  the  image-based  observation  channel  for  target 
mode  estimation  in  discrete  time  is  presented  in  this  paper.  A  multidimensional  point  process 
filter  is  obtained  by  making  use  of  the  discrete  time  point  process  theory  and  its  utilization  is 
illustrated  through  simulation  examples. 

10.  A  New  Approximation  for  the  Partially  Observed  Jump  Linear  Quadratic  Problem  (C.  Yang 
and  M.  Mariton,  submitted  to  Int'l.  Journal  of  Systems  Science.  Oct.  1989). 

We  consider  the  Jump  Linear  Quadratic  Problem  where  linear  state  dynamics  are  made 
contingent  upon  the  Markovian  transition  of  a  regime  variable.  It  is  desired  to  regulate  the 
state  while  minimizing  a  quadratic  performance  index.  In  the  case  of  partial  observations  the 
exact  solution  has  proved  to  be  elusive  and,  in  this  paper,  we  present  a  new  approximation 
based on the optimal solution of an averaged version of the original problem.
I. INTRODUCTION

In this contribution we present a novel approach to the problem of
filtering for a linear system with Markovian coefficients

$$x_t = a(\theta_t)\, x_{t-1} + b(\theta_t)\, w_t \qquad (1)$$

with observations

$$y_t = h(\theta_t)\, x_t + g(\theta_t)\, v_t. \qquad (2)$$

Here $\theta_t$ is a finite state Markov chain taking values in $\{1, \ldots, N\}$ according to
a transition probability matrix $H$, and $w_t$, $v_t$ are mutually independent
white Gaussian processes. The exact filter consists of a growing number
of linear Gaussian hypotheses, with the growth being exponential in
time. Obviously, for filtering we need recursive algorithms whose
complexity does not grow with time. The main problem is therefore to
avoid the exponential growth of the number of Gaussian hypotheses in an
efficient way.
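
To make the growth of hypotheses concrete, the following minimal Python sketch simulates a scalar instance of (1), (2) and counts the mode histories an exact filter would have to track. The function names, the 0-based mode labels, the unit-variance noises, and the convention that `H[i][j]` is the probability of moving from mode `j` to mode `i` are assumptions made for illustration only.

```python
import random


def simulate_switching_system(a, b, h, g, H, steps, x0=0.0, theta0=0, seed=0):
    """Simulate x_t = a[th]*x_{t-1} + b[th]*w_t and y_t = h[th]*x_t + g[th]*v_t.

    Assumption: H[i][j] = P{theta_t = i | theta_{t-1} = j}, so each column of H
    sums to one. w_t and v_t are drawn as independent unit-variance Gaussians.
    """
    rng = random.Random(seed)
    n_modes = len(a)
    x, theta = x0, theta0
    path = []
    for _ in range(steps):
        # Draw the next mode from the column of H that belongs to the current mode.
        column = [H[i][theta] for i in range(n_modes)]
        theta = rng.choices(range(n_modes), weights=column)[0]
        x = a[theta] * x + b[theta] * rng.gauss(0.0, 1.0)
        y = h[theta] * x + g[theta] * rng.gauss(0.0, 1.0)
        path.append((theta, x, y))
    return path


def exact_filter_hypotheses(n_modes, t):
    """Number of distinct mode histories of length t, i.e. N**t."""
    return n_modes ** t


if __name__ == "__main__":
    H = [[0.975, 0.05], [0.025, 0.95]]
    path = simulate_switching_system([0.995, 0.990], [1.0, 1.0],
                                     [1.0, 1.0], [1.0, 1.0], H, steps=5)
    print(path)
    print([exact_filter_hypotheses(2, t) for t in (1, 10, 20)])
```

Even for two modes the count passes one million after 20 steps, which is why some form of hypotheses reduction is unavoidable.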

This hypotheses management problem is also known for several other
filtering situations [10], [3], [6], [9], and [4]. All these problems have
stimulated during the last two decades the development of a large variety
of approximation methods. For our problem the majority of these are
techniques that reduce the number of Gaussian hypotheses by pruning
and/or merging of hypotheses. Well-known examples of this approach are
the detection estimation (DE) algorithms and the generalized pseudo
Bayes (GPB) algorithms. For overviews and comparisons see [14], [7],
[12], and [17]. None of the algorithms discussed appeared to have good
performance at modest computational load. Because of that, other
approaches have also been developed, mainly by way of approximating
the model (1), (2). Examples are the modified multiple model (MM)
algorithms [20], [7], the modified gain extended Kalman (MGEK) filter of
Song and Speyer [13], [7], and residual based methods [19], [2]. These
algorithms, however, also lack good performance at modest computational
load in too many situations. In view of this unsatisfactory situation
and the practical importance of better solutions, the filtering problem for
the class of systems (1), (2) needed further study.

One item that has not received much attention in the past is the timing of
hypotheses reduction. It is common practice to reduce the number of
Gaussian hypotheses immediately after a measurement update. Indeed, at
first sight there does not seem to be a better moment. However, in two
recent publications [3], [1], this point has been exploited to develop,
respectively, the so-called IMM (interacting multiple model) and AFMM
(adaptive forgetting through multiple models) algorithms. The latter
exploits pruning to reduce the number of hypotheses, while the IMM
exploits merging. The IMM algorithm was the reason for a further
evaluation of the timing of hypotheses reduction. A novel approach to
hypotheses merging is presented for a dynamic MM situation, which leads
to an elegant derivation of the IMM algorithm. Next, Monte Carlo
simulations are presented to judge the state of the art in MM filtering after
the introduction of the IMM algorithm.

II. TIMING OF HYPOTHESES REDUCTION

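To show the possibilities of timing the hypothesis reduction, we start
with a filter cycle from one measurement update up to and including the
next measurement update. For this, we take a cycle of recursions for the
evolution of the conditional probability measure of our hybrid state
Markov process $(x_t, \theta_t)$. This cycle reads as follows:

$$P\{\theta_{t-1} \mid Y_{t-1}\} \xrightarrow{\text{Mixing}} P\{\theta_t \mid Y_{t-1}\} \qquad (3)$$

if $P\{\theta_t \mid Y_{t-1}\} = 0$ prune hypothesis $\theta_t$,

$$p[x_{t-1} \mid \theta_{t-1}, Y_{t-1}] \xrightarrow{\text{Mixing}} p[x_{t-1} \mid \theta_t, Y_{t-1}] \qquad (4)$$

$$p[x_{t-1} \mid \theta_t, Y_{t-1}] \xrightarrow{\text{Evolution}} p[x_t \mid \theta_t, Y_{t-1}] \qquad (5)$$

$$P\{\theta_t \mid Y_{t-1}\} \xrightarrow{\text{Bayes}} P\{\theta_t \mid Y_t\} \qquad (6)$$

$$p[x_t \mid \theta_t, Y_{t-1}] \xrightarrow{\text{Bayes}} p[x_t \mid \theta_t, Y_t]. \qquad (7)$$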

For output purposes, we can use the law of total probability

$$p[x_t \mid Y_t] = \sum_i p[x_t \mid \theta_t = i, Y_t]\, P\{\theta_t = i \mid Y_t\}. \qquad (8)$$

Let us take a closer look at the derivation of the above cycle. As $v_t$ and $w_t$
are mutually independent, the Bayes formula, which represents (6) and
(7), follows easily from (2). From the evolution of system (1) follows (5).
The Chapman-Kolmogorov equation for the Markov chain $\theta_t$,

$$P\{\theta_t = i \mid Y_{t-1}\} = \sum_j H_{ij}\, P\{\theta_{t-1} = j \mid Y_{t-1}\}, \qquad (9)$$

which represents (3), can be seen as a "mixing." To derive a
representation of (4) we first introduce the following equation on the basis
of the law of total probability:

$$p[x_{t-1} \mid \theta_t = i, Y_{t-1}] = \sum_j \Big( p[x_{t-1} \mid \theta_{t-1} = j, \theta_t = i, Y_{t-1}] \cdot P\{\theta_{t-1} = j \mid \theta_t = i, Y_{t-1}\} \Big). \qquad (10)$$

As $\theta_t$ is independent of $x_{t-1}$ if $\theta_{t-1}$ is known, we easily obtain

$$p[x_{t-1} \mid \theta_{t-1} = j, \theta_t = i, Y_{t-1}] = p[x_{t-1} \mid \theta_{t-1} = j, Y_{t-1}].$$

Substitution of this and of the following:

$$P\{\theta_{t-1} = j \mid \theta_t = i, Y_{t-1}\} = H_{ij}\, P\{\theta_{t-1} = j \mid Y_{t-1}\} / P\{\theta_t = i \mid Y_{t-1}\}$$

in (10) yields the desired representation of transition (4)

$$p[x_{t-1} \mid \theta_t = i, Y_{t-1}] = \sum_j H_{ij}\, P\{\theta_{t-1} = j \mid Y_{t-1}\} \cdot p[x_{t-1} \mid \theta_{t-1} = j, Y_{t-1}] / P\{\theta_t = i \mid Y_{t-1}\}. \qquad (11)$$

Notice that the mixing of the densities in (11) is explicitly related to the
above-mentioned Markov properties of $\theta_t$ and the conditional independence
of $\theta_t$ and $x_{t-1}$, given $\theta_{t-1}$. According to the above filtering cycle
there are at any moment in time $N$ densities on $R^n$ and $N$ scalars. The
densities on $R^n$ are rarely Gaussian. Even if $p[x_0 \mid \theta_0]$ is Gaussian, then
$p[x_t \mid \theta_t = i, Y_t]$ is in general a sum of $N^{t-1}$ weighted Gaussians
(Gaussian mixture). Explicit recursions for these $N^t$ individual Gaussians
and their weights can simply be obtained from the above filter cycle.
Obviously, the $N$ times increase of the number of Gaussians during each
filter cycle is caused by (4) only.
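
The two mixing operations (9) and (11) reduce to a few lines of arithmetic on the mode probabilities. The sketch below is illustrative only: the function names are hypothetical, and it assumes that `H[i][j]` stores the probability of a jump from mode `j` to mode `i`, which is the reading of $H_{ij}$ suggested by (9).

```python
def predict_mode_probs(H, p_prev):
    """Chapman-Kolmogorov step (9): P{theta_t = i | Y_{t-1}}."""
    n = len(p_prev)
    return [sum(H[i][j] * p_prev[j] for j in range(n)) for i in range(n)]


def mixing_weights(H, p_prev):
    """Weights P{theta_{t-1} = j | theta_t = i, Y_{t-1}} that appear in (11).

    Row i of the result weights the N mode-conditioned densities of x_{t-1}.
    A row is None when hypothesis i has zero predicted probability (pruned).
    """
    n = len(p_prev)
    p_pred = predict_mode_probs(H, p_prev)
    weights = []
    for i in range(n):
        if p_pred[i] == 0.0:
            weights.append(None)
        else:
            weights.append([H[i][j] * p_prev[j] / p_pred[i] for j in range(n)])
    return p_pred, weights


if __name__ == "__main__":
    H = [[0.975, 0.05], [0.025, 0.95]]
    p_pred, W = mixing_weights(H, [0.5, 0.5])
    print(p_pred)
    print(W)
```

Each non-pruned row of weights sums to one, so (11) is a proper mixture of the $N$ densities available at time $t-1$.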

In the sequence of elementary transitions, (3) through (7), we can apply
a hypotheses reduction either after (4), after (5), or after (7). We review
these reduction timing possibilities for the fixed depth merging hypotheses
reduction. This fixed depth merging approach implies that the Gaussian
hypotheses, for which the Markov chain paths are equivalent during the
recent past of some fixed depth, are merged to one moment-matched
Gaussian hypothesis. The degrees of freedom in applying this fixed depth
merging approach are the choice of the depth, $d$ ($\ge 1$), and the moment of
application. If the application is immediately after each measurement
update pass (7), it yields the GPB($d + 1$) algorithms [14], [16]. In the
next section we derive the IMM algorithm by applying the fixed depth
merging approach with depth $d = 1$ after each pass of (4). It can easily
be verified that all other timing possibilities yield disguised versions of
IMM and GPB algorithms. Merging after (5) with $d = 1$ yields a
disguised but more complex IMM algorithm. Merging either after (4) or
after (5) with $d \ge 2$ yields a disguised but more complex GPB$d$
algorithm.
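
The merging of several Gaussian hypotheses to one moment-matched Gaussian can be sketched as follows for the scalar case. The function name is hypothetical; the only assumption is the usual meaning of moment matching, namely that the merged Gaussian has the mean and variance of the weighted mixture.

```python
def merge_gaussians(weights, means, variances):
    """Moment-match a scalar Gaussian mixture by a single Gaussian.

    Returns the mean and variance of the mixture. The variance contains the
    weighted component variances plus the spread of the component means.
    """
    total = sum(weights)
    w = [wi / total for wi in weights]  # normalise, in case weights do not sum to one
    mean = sum(wi * mi for wi, mi in zip(w, means))
    var = sum(wi * (vi + (mi - mean) ** 2)
              for wi, mi, vi in zip(w, means, variances))
    return mean, var


if __name__ == "__main__":
    # Two equally weighted unit-variance hypotheses centred at 0 and 2.
    print(merge_gaussians([0.5, 0.5], [0.0, 2.0], [1.0, 1.0]))
```

The spread-of-the-means term is what keeps the merged hypothesis from being overconfident when the merged components disagree.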


III. THE IMM ALGORITHM

The  IMM  algorithm  cycle  consists  of  the  following  four  steps,  of  which 
the  first  three  steps  are  illustrated  in  Fig.  1. 

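1) Starting with the $N$ weights $\hat p_i(t-1)$, the $N$ means $\hat x_i(t-1)$ and
the $N$ associated covariances $\hat R_i(t-1)$, one computes the mixed initial
condition for the filter matched to $\theta_t = i$, according to the following
equations:

$$\bar p_i(t) = \sum_j H_{ij}\, \hat p_j(t-1), \quad \text{if } \bar p_i(t) = 0 \text{ prune hypothesis } i, \qquad (12)$$

$$\hat x^i(t-1) = \sum_j H_{ij}\, \hat p_j(t-1)\, \hat x_j(t-1) / \bar p_i(t), \qquad (13)$$

$$\hat R^i(t-1) = \sum_j H_{ij}\, \hat p_j(t-1) \big[ \hat R_j(t-1) + [\hat x_j(t-1) - \hat x^i(t-1)][\,\cdot\,]^T \big] / \bar p_i(t). \qquad (14)$$

2) Each of the $N$ pairs $\hat x^i(t-1)$, $\hat R^i(t-1)$ is used as input to a
Kalman filter matched to $\theta_t = i$. Time-extrapolation yields $\bar x_i(t)$, $\bar R_i(t)$,
and then measurement updating yields $\hat x_i(t)$, $\hat R_i(t)$.

3) The $N$ weights $\bar p_i(t)$ are updated from the innovations of the $N$
Kalman filters,

$$\hat p_i(t) = c \cdot \bar p_i(t) \cdot \|Q_i(t)\|^{-1/2} \exp\{ -\tfrac{1}{2}\, \nu_i^T(t)\, Q_i^{-1}(t)\, \nu_i(t) \} \qquad (15)$$

with $c$ denoting a normalizing constant,

$$\nu_i(t) = y_t - h(i)\, \bar x_i(t) \qquad (16)$$

$$Q_i(t) = h(i)\, \bar R_i(t)\, h^T(i) + g(i)\, g^T(i). \qquad (17)$$

4) For output purpose only, $\hat x_t$ and $\hat R_t$ are computed according to

$$\hat x_t = \sum_i \hat p_i(t)\, \hat x_i(t) \qquad (18)$$

$$\hat R_t = \sum_i \hat p_i(t) \big[ \hat R_i(t) + [\hat x_i(t) - \hat x_t][\,\cdot\,]^T \big]. \qquad (19)$$
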

Only  step  1)  is  typical  for  the  IMM  algorithm.  Specifically,  the  mixing 
represented  by  (13)  and  (14)  and  by  the  interaction  box  in  Fig.  1,  cannot 
be  found  in  the  GPB  algorithms.  This  is  the  key  of  the  novel  approach  to 
the  timing  of  fixed  depth  hypotheses  merging  that  yields  the  IMM 
algorithm.  We  give  a  derivation  of  the  key  step  1). 

Application of fixed depth merging with $d = 1$ implies that

$$p[x_{t-1} \mid \theta_{t-1} = i, Y_{t-1}] \sim N\{\hat x_i(t-1), \hat R_i(t-1)\}.$$

Substitution of this in (11) immediately yields (13) and (14), with

$$\hat x^i(t-1) \triangleq E\{x_{t-1} \mid \theta_t = i, Y_{t-1}\}$$

and $\hat R^i(t-1)$ the associated covariance. Finally, we introduce the approximation

$$p[x_{t-1} \mid \theta_t = i, Y_{t-1}] \sim N\{\hat x^i(t-1), \hat R^i(t-1)\},$$

which guarantees that all subsequent IMM steps fit correctly.
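
The four steps of the IMM cycle can be written compactly for scalar $x_t$ and $y_t$. The sketch below is a minimal illustration, not the authors' implementation. Its assumptions are: `H[i][j]` holds $H_{ij}$ as used in (12), the noises $w_t$ and $v_t$ have unit variance, every innovation variance is positive, and the optional argument `u` is a known additive input (zero by default). The function and variable names are hypothetical.

```python
import math


def imm_step(p_hat, x_hat, R_hat, y, H, a, b, h, g, u=0.0):
    """One IMM cycle for a scalar switching system.

    p_hat, x_hat, R_hat: mode weights, means and variances at time t-1.
    Returns the updated (p_hat, x_hat, R_hat) and the overall mean and variance.
    """
    n = len(p_hat)
    p_bar = [sum(H[i][j] * p_hat[j] for j in range(n)) for i in range(n)]  # (12)
    new_p, new_x, new_R = [0.0] * n, [0.0] * n, [0.0] * n
    for i in range(n):
        if p_bar[i] == 0.0:
            continue  # pruned hypothesis keeps zero weight
        # Step 1: mixed initial condition for the filter matched to mode i.
        w = [H[i][j] * p_hat[j] / p_bar[i] for j in range(n)]
        x0 = sum(w[j] * x_hat[j] for j in range(n))                          # (13)
        R0 = sum(w[j] * (R_hat[j] + (x_hat[j] - x0) ** 2) for j in range(n))  # (14)
        # Step 2: Kalman filter matched to mode i.
        x_pred = a[i] * x0 + u
        R_pred = a[i] ** 2 * R0 + b[i] ** 2
        nu = y - h[i] * x_pred                 # innovation (16)
        Q = h[i] ** 2 * R_pred + g[i] ** 2     # innovation variance (17)
        gain = R_pred * h[i] / Q
        new_x[i] = x_pred + gain * nu
        new_R[i] = (1.0 - gain * h[i]) * R_pred
        # Step 3: unnormalised weight update (15).
        new_p[i] = p_bar[i] * math.exp(-0.5 * nu * nu / Q) / math.sqrt(Q)
    c = sum(new_p)
    new_p = [p / c for p in new_p]
    # Step 4: overall estimate, for output only, (18) and (19).
    x_out = sum(new_p[i] * new_x[i] for i in range(n))
    R_out = sum(new_p[i] * (new_R[i] + (new_x[i] - x_out) ** 2) for i in range(n))
    return new_p, new_x, new_R, x_out, R_out


if __name__ == "__main__":
    H = [[0.975, 0.05], [0.025, 0.95]]
    print(imm_step([0.5, 0.5], [10.0, 10.0], [10.0, 10.0], 9.0, H,
                   a=[0.995, 0.990], b=[0.1, 0.1], h=[1.0, 1.0], g=[5.0, 5.0]))
```

Note that the overall estimate from step 4 is not fed back: only the $N$ mode-conditioned weights, means, and variances are carried to the next cycle.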

Remark: The IMM can be approximated by the GPB1 algorithm by
replacing $\hat x_i(t-1)$ and $\hat R_i(t-1)$ in step 1) by $\hat x_{t-1}$ and $\hat R_{t-1}$. Together
with (12) this approximates (13) and (14) in step 1) by $\hat x^i(t-1) = \hat x_{t-1}$
and $\hat R^i(t-1) = \hat R_{t-1}$. These equations are equivalent to (13) and (14) if
each component of $H$ equals $1/N$, which implies that $\theta_t$ is a sequence of
mutually independent stochastic variables. The latter is hardly ever the
case and we conclude that the reduction of the IMM to GPB1 leads to a
significant performance degradation. Obviously, the computational loads
of IMM and GPB1 are almost equivalent.
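
The claim in the remark can be checked numerically: with every component of $H$ equal to $1/N$, the mixed initial conditions (13), (14) coincide with the overall mean and variance used by GPB1, while a matrix with long sojourn times gives mode-dependent initial conditions. The following scalar sketch uses hypothetical names and illustrative numbers, and assumes no hypothesis is pruned.

```python
def mixed_initial_conditions(H, p_hat, x_hat, R_hat):
    """Mixed mean and variance per mode, following (12)-(14), scalar case."""
    n = len(p_hat)
    mixed = []
    for i in range(n):
        p_bar = sum(H[i][j] * p_hat[j] for j in range(n))   # assumed nonzero
        w = [H[i][j] * p_hat[j] / p_bar for j in range(n)]
        x0 = sum(w[j] * x_hat[j] for j in range(n))
        R0 = sum(w[j] * (R_hat[j] + (x_hat[j] - x0) ** 2) for j in range(n))
        mixed.append((x0, R0))
    return mixed


def overall_moments(p_hat, x_hat, R_hat):
    """Overall mean and variance, as in (18) and (19), used by GPB1 as restart."""
    x = sum(p * m for p, m in zip(p_hat, x_hat))
    R = sum(p * (r + (m - x) ** 2) for p, m, r in zip(p_hat, x_hat, R_hat))
    return x, R


p_hat, x_hat, R_hat = [0.3, 0.7], [1.0, 5.0], [2.0, 0.5]
gpb1 = overall_moments(p_hat, x_hat, R_hat)
mixed_uniform = mixed_initial_conditions([[0.5, 0.5], [0.5, 0.5]], p_hat, x_hat, R_hat)
mixed_sticky = mixed_initial_conditions([[0.975, 0.05], [0.025, 0.95]], p_hat, x_hat, R_hat)

if __name__ == "__main__":
    print(gpb1)           # single restart condition shared by all GPB1 filters
    print(mixed_uniform)  # identical to gpb1 for every mode
    print(mixed_sticky)   # differs per mode
```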
Fig. 1. The IMM algorithm.

IV.  PERFORMANCE  OF  THE  IMM  ALGORITHM 

Presently a comparison of the different filtering algorithms for systems
with Markovian coefficients with respect to their performance is
hampered by the analytical complexity of the problem [16], [15]. Because
of this, such comparisons necessarily rely on Monte Carlo simulations for
specific examples. For our simulated examples we used the set of 19 cases
that have been developed by Westwood [18]. To make the comparison
more precise, we specify these cases and summarize the observed
performance results. In all 19 cases both $x_t$ and $y_t$ are scalar processes,
which satisfy $x_t = a(\theta_t) x_{t-1} + b(\theta_t) w_t + u(t)$ and $y_t = h(\theta_t) x_t +
g(\theta_t) v_t$, with $\theta_t \in \{0, 1\}$, $u(t) = 10 \cos(2\pi t/100)$, $x_0$ a Gaussian
variable with expectation 10 and variance 10, $P\{\theta_0 = 1\} = P\{\theta_0 = 0\}
= 1/2$, while $H_{00} = (1 - 1/\tau_0)$ and $H_{11} = (1 - 1/\tau_1)$. The parameters
$a$, $b$, $h$, $g$ and the average sojourn times $\tau_0$ and $\tau_1$ of these 19 cases are
given in Table I.

The results of Westwood [18] show that in all 19 cases the differences
in performance of the GPB2 and the GPB3 algorithms are negligible,
while in only seven cases (5, 6, 8, 16, 17, 18, 19) the differences in
performance of the GPB1 and the GPB2 algorithms are negligible. For our
present comparison the other 12 cases (1, 2, 3, 4, 7, 9, 10, 11, 12, 13, 14,
15) are of interest. For each of these 12 cases we simulated the GPB1, the
GPB2, and the IMM algorithms and ran Monte Carlo simulations,
consisting of 100 runs from $t = 0$ to $t = 100$. For simplicity of
interpretation of the results we used one fixed path of $\theta$ during all runs: $\theta
= 0$ on the time interval [0, 30], $\theta = 1$ on the interval [31, 60], and $\theta = 0$
on the interval [61, 100].
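
The fixed ingredients of this simulation setup can be written down directly. The sketch below only builds the scenario: the transition matrix from the average sojourn times, the fixed path of $\theta$, the input $u(t)$, and an rms-over-runs helper. The function names are hypothetical, and the layout `H[i][j]` (probability of a jump from mode `j` to mode `i`) is an assumption about the index convention.

```python
import math


def transition_matrix(tau0, tau1):
    """Two-mode matrix with H00 = 1 - 1/tau0 and H11 = 1 - 1/tau1.

    Assumed layout: H[i][j] = P{theta_t = i | theta_{t-1} = j}.
    """
    h00 = 1.0 - 1.0 / tau0
    h11 = 1.0 - 1.0 / tau1
    return [[h00, 1.0 - h11], [1.0 - h00, h11]]


def fixed_mode_path(t_end=100):
    """theta = 0 on [0, 30], 1 on [31, 60], 0 on [61, t_end]."""
    return [1 if 31 <= t <= 60 else 0 for t in range(t_end + 1)]


def control_input(t):
    """u(t) = 10 cos(2 pi t / 100)."""
    return 10.0 * math.cos(2.0 * math.pi * t / 100.0)


def rms_over_runs(errors):
    """errors[r][t] is the estimation error of run r at time t."""
    n_runs = len(errors)
    return [math.sqrt(sum(run[t] ** 2 for run in errors) / n_runs)
            for t in range(len(errors[0]))]


if __name__ == "__main__":
    print(transition_matrix(40, 20))
    print(sum(fixed_mode_path()))
    print(control_input(0), control_input(50))
```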

The results of our simulations for the 12 interesting cases are as
follows. In six cases (1, 2, 7, 12, 14, 15) both the IMM and the GPB2
performed slightly better than the GPB1, while the IMM and the GPB2
performed equally well. For typical results, see Fig. 2. In the other six
cases both the IMM and the GPB2 performed significantly better than the
GPB1. For typical results see Figs. 3 and 4. Of these six cases the IMM
and the GPB2 performed four times equally well (cases 3, 4, 11, and 13)
and two times significantly differently (cases 9 and 10).

On the basis of these simulations we can conclude that the IMM
performs almost as well as the GPB2, while its computational load is
about that of GPB1. We can further differentiate this overall conclusion.

•  Increasing the parameters $\tau_0$ and $\tau_1$ increases the difference in
performance between GPB1 and GPB2, but not between IMM and GPB2.

•  If $a$ is being switched, then the IMM performs as well as the GPB2,
while the GPB1 sometimes stays significantly behind.

•  If the white noise gains, $b$ or $g$, are being switched, then the IMM
performs as well as the GPB2, while the GPB1 sometimes stays
significantly behind.

•  If only $h$ is being switched, then in some cases the IMM, and even
more often the GPB1, tend to diverge while the GPB2 works well.

Another interesting question is how the IMM compares to the modified
MM algorithm and the MGEK filter. Apart from the GPB algorithms,
Westwood [18] also evaluated four more filters: the MM, the modified
MM, the MGEK, and a MGEK with a "postprocessor." For the 19 cases
there was only one algorithm that outperformed the GPB1 algorithm in
some cases. It was the MGEK filter in cases 1, 3, and 4. He also found
that the MGEK filter performed in these cases marginally or significantly
worse than the GPB2 algorithm. As the above experiments showed that


TABLE I

THE PARAMETERS OF THE 19 CASES OF WESTWOOD [18]

        H-VALUES      θ-DEPENDENT VALUES
CASE #  τ0     τ1     a(0),a(1)   b(0),b(1)   h(0),h(1)   g(0),g(1)
 1      40     20     .995,.990   1.0         1.0         1.0
 2      40     20     .995,.990   .5          1.0         .5
 3      40     20     .995,.990   .1          1.0         5.0
 4      200    100    .995,.990   .1          1.0         5.0
 5      40     20     .995,.990   8.0         1.0         1.0
 6      40     20     .995,.990   1.0         1.0         .3
 7      40     20     .995,.900   .5          1.0         2.0
 8      40     20     .995,.750   1.0         1.0         .6
 9      40     20     .595        2.0         1.0,.95     .5
10      40     20     .995        1.0         1.0,.80     .2
11      40     20     .995        .5          1.0,.80     .8
12      4      2      .995        .5          1.0,.80     .8
13      200    100    .995        .5          1.0,.80     .8
14      40     20     .995        .1,5.0      1.0         1.0
15      40     20     .995        1.0         1.0         .1,5.0
16      10     2      .95         .5          1.0,0.0     1.0,2.0
17      200    5      .950,0.0    1.0         1.0         1.0
18      50     5      .950,1.2    1.0         1.0         1.0
19      10     2      .95         .5          1.0         1.0,40.0

(A single entry means that the value is the same for θ = 0 and θ = 1.)

Fig. 2. rms error for case 7, illustrative of the six cases (1, 2, 7, 12, 14, 15) where both
IMM and GPB2 perform slightly better than GPB1.


Fig. 3. rms error for case 3, illustrative of the four cases (3, 4, 11, 13) where both IMM
and GPB2 perform better than GPB1, while IMM and GPB2 perform equally well.


for cases 1, 3, and 4 the GPB2 and the IMM algorithm performed equally
well, one can conclude that the MM, the modified MM, the MGEK, the
MGEK with "postprocessor," and the GPB1 are in all 19 cases
outperformed by the IMM algorithm.

On the basis of these comparisons one can conclude that for practical
filtering applications with $N = 2$, the IMM algorithm is the best first
choice. As the IMM algorithm has been developed on the basis of some
general hypotheses reduction principles, which are $N$-invariant, one can
reasonably expect that this is also true for larger $N$. But it is unlikely that
the IMM performs in all applications almost as well as the exact filter.
Fig. 4. rms error for case 9, illustrative of the two cases (9 and 10) where IMM
performs better than GPB1, but slightly worse than GPB2 (in these two cases only $h$
jumps).


Therefore, if the IMM does not perform well enough in a particular
application, one should consider using a suitable GPB ($\ge 2$) or DE
algorithm [14], or one might try to design a better algorithm by using
adaptive merging techniques [16]. The DE algorithm might possibly be
improved by the novel timing of hypotheses reduction [1]. If for a
particular application the selected algorithm has too high a
computational load, then it is best to try to exploit some geometrical
structure of the problem considered [2], [11].

In situations where estimation has to be done outside some time-critical
control loop, it is usually preferable to use a smoothing algorithm instead
of a filtering algorithm [8], [14], [21]. In view of the above filtering
results, this suggests that the ideas that underlie the IMM algorithm can be
exploited to develop better smoothing algorithms.
Abstract

An area of current interest is the estimation of
the state of discrete-time stochastic systems with
parameters which may switch among a finite set of
values. The parameter switching process of interest
is modeled by a class of semi-Markov chains. This
class of processes is useful in that it pertains to
many areas of interest, such as the failure detection
problem, the target tracking problem, socio-economic
problems, and the problem of approximating
nonlinear systems by a set of linearized models. It
is shown in this paper how the transition
probabilities, which govern the model switching at
each time step, can be inferred via the evaluation of
the conditional distribution of the sojourn time.
Following this, a recursive state estimation
algorithm for dynamic systems with noisy observations
and changing structures, which uses the conditional
sojourn time distribution, is derived.

1. Introduction

In this paper we are concerned with failure
detection via recursive estimation of parameters in
discrete-time dynamic systems. The topic of interest
is stochastic systems with abruptly changing
parameters, i.e., model jumps. The recursive state
estimation algorithm for this problem developed in
this paper provides the conditional model
probabilities used for detecting the changes in system
parameters which signify component failures.

The abruptly changing parameters, which switch
among a finite set of values, are modeled as a Markov
or a semi-Markov chain with known transition
statistics [M2, M3, M5-M8, G1]. Although the idea of
semi-Markov chains is appropriate for the model
concerned, the analysis presented in the above is
actually only for Markov chains (since the transition
probabilities were assumed fixed and the transitions
depended only on the latest state; see Eq. (8) in
[M2]). The process considered in this paper is of
the semi-Markov type and pertains to many areas of
interest. A failure in a component of a dynamical
system can be represented by a sudden change in the
system's parameters [B5, S1, W1]. Also, a repair to a
system represents a change in the parameters [B5].
Other areas that this class of processes pertains to
are the target tracking problem [B1], socio-economic
problems [G2] and the technique of approximating
grossly nonlinear systems by a set of linearized
models [M4, V1, V2].

The first treatment of estimation in a switching environment was in
[A1], where the means and covariances of the process and measurement
noises experienced jumps. As indicated in [C1], the optimum state
estimation in a multiple model environment is a function of the
elemental ("model-matched") state estimates obtained via estimators
tuned to all possible parameter histories. Thus, with time, the
estimator must keep track of an exponentially growing number of
parameter history hypotheses. Even in the case of Markov switching
the estimation algorithm requires exponentially growing memory
[T1, T2]. Suboptimal algorithms like the Generalized Pseudo-Bayesian
Algorithm (GPB) [A1, C1, T2] and the Interacting Multiple-Model
Algorithm (IMM) [B2, B3, B4] are viable approaches to obtain a
real-time implementable estimation algorithm. These algorithms rely
on different hypothesis merging techniques to limit the memory and
computational requirements [B4].

In [S2, C2] a semi-Markov switching problem was considered, but the
jumps were assumed to be perfectly observed. In [M9] an estimation
scheme for semi-Markov processes was developed based upon the
detection-estimation algorithm (DEA). This approach is obtained by
retaining a certain number of most likely parameter history
hypotheses. The estimation schemes based upon the DEA (which discards
all but a number of most likely history hypotheses) and the GPB or
IMM (which use hypothesis merging) algorithms represent different
philosophies of algorithm design. We present an example comparing
the two methods for a particular state estimation problem later in
this paper.

The problem is formulated in Section 2. In Section 3 the sojourn time
conditional probability mass functions and the conditional transition
probabilities, which were derived in [M1a], are given for clarity and
ease of reference. Section 4, the state estimation algorithm which
was developed in [M1b], is included for the sake of completeness. In
Section 5 simulations are presented. Preliminary results on this
problem were presented in [M1a, M1b].

2.  Formulation  of  the  problem 

The system is modeled by the equations

$$x(k) = F[M(k)]\, x(k-1) + v[k-1, M(k)] \tag{2.1}$$

$$z(k) = H[M(k)]\, x(k) + w[k, M(k)] \tag{2.2}$$

where $M(k)$ denotes the model "at time k", i.e., in effect during
the sampling period ending at k. The process and measurement noise
sequences, $v(k)$ and $w(k)$, are white and mutually uncorrelated.

The model at time k is assumed to be among the possible r models

$$M(k) \in \{1, \ldots, r\} \tag{2.3}$$

For example

$$F[M(k) = j] = F_j \tag{2.4}$$

$$v[k-1, M(k) = j] \sim \mathcal{N}(u_j, Q_j) \tag{2.5}$$

i.e., the structure of the system and/or the statistics of the noises
might be different from model to model. The mean $u_j$ of the noise
can model a maneuver or a system failure as a deterministic input.

The model switching process to be considered here is of the
semi-Markov type. The process is specified by a family of transition
matrices $p_{ij}(\tau_i)$, i.e., it is a "sojourn-time-dependent
Markov" (STDM) chain, which belongs to the semi-Markov class. The
specification of the STDM chain is more closely related to physical
models because it does not have the artificial restart of the
sojourn-time counting of the semi-Markov process for virtual
transitions* and can capture important features in many realistic
situations.
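As an illustration of the STDM chain just described, the following minimal Python sketch simulates a model sequence whose transition probabilities depend on the current sojourn time. The function names and the two-model transition law `example_p` are assumptions made purely for illustration; they are not taken from the paper.

```python
import random

def simulate_stdm_chain(p, r, steps, m0=0, seed=0):
    """Simulate a sojourn-time-dependent Markov (STDM) chain.

    p(i, j, tau) is a hypothetical callable giving the probability of
    moving from model i to model j after a sojourn of tau steps in i.
    Returns the visited models and the matching sojourn times.
    """
    rng = random.Random(seed)
    model, tau = m0, 1  # the sojourn time is taken as 1 at k = 0
    models, sojourns = [model], [tau]
    for _ in range(steps):
        weights = [p(model, j, tau) for j in range(r)]
        nxt = rng.choices(range(r), weights=weights)[0]
        # The counter grows while the model is unchanged and restarts
        # only on a real jump (no restart for "virtual" transitions).
        tau = tau + 1 if nxt == model else 1
        model = nxt
        models.append(model)
        sojourns.append(tau)
    return models, sojourns

def example_p(i, j, tau):
    # Assumed two-model law: staying becomes less likely as tau grows.
    stay = max(0.5, 0.95 - 0.05 * tau)
    return stay if i == j else 1.0 - stay

models, sojourns = simulate_stdm_chain(example_p, r=2, steps=50)
```

Because the transition law is passed in as a function of the sojourn time, an ordinary Markov chain is recovered as the special case in which `p` ignores `tau`.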

For the class of semi-Markov chains governing the evolution of the
system's model considered here, we need the pmf of the sojourn time
conditioned on the observations to infer the transition probabilities.

The  conditional  transition  probabilities  based  on 
noisy  observations  of  the  system's  state  are  obtained 
in  the  next  section. 


*A semi-Markov (SM) chain [H1, H2, fill] is characterized by a fixed
matrix of transition probabilities $[p_{ij}]$ and a matrix of sojourn
time probability density functions $[f_{ij}(\tau_i)]$, which are
functions of the current state i as well as the destination state j
of the transition. In an SM chain first the destination of the jump
is chosen according to $[p_{ij}]$ and then the time after which the
jump takes place (i.e., the sojourn time) is chosen according to
$[f_{ij}(\tau_i)]$. In this model the process can undergo a virtual
transition (i.e., jump "in place" if $j = i$); however, in this case,
the sojourn time counting is still restarted even though the system
has been in state i for some time.

3. Sojourn Time Probability Mass Functions and
Conditional Transition Probabilities

The process $M(k)$, $k = 0, 1, \ldots$, which represents the system
model, can exist in one of r possible states. The current
probabilities of transition for the STDM process (chain) are
functions of the sojourn time $\tau$ and are defined as

$$p_{ij}(\tau) = P\{M(k) = j \mid M(k-1) = i,\ \tau_i(k-1) = \tau\} \tag{3.1}$$

where $\tau_i(k-1)$ is the sojourn time in state i at time k-1. It is
assumed that at $k = 0$ the sojourn time (in whatever state the
system model is) is $\tau = 1$. Thus the values $\tau$ can take are
from 1 to the maximum, which at time k-1 is then k.

Let $z(i)$ be a noisy measurement of the state of the dynamic system
whose model undergoes transitions according to the above described
STDM process. Based on the available information
$Z^k = \{z(i)\}_{i=1}^{k}$, the probability of the model process
being in state i, denoted as $\mu_i(k)$, is defined as

$$\mu_i(k) = P\{M(k) = i \mid Z^k\}, \qquad i = 1, \ldots, r \tag{3.2}$$

The conditional pmf of the sojourn time in state $M(k) = i$ based on
the available information $Z^k$ at time k is

$$\begin{aligned}
g_i^k(\tau) &\triangleq P\{\tau_i(k) = \tau \mid M(k) = i, Z^k\}
 = P\{\tau_i(k) = \tau \mid M(k) = i, Z^{k-1}\} \\
&= P\{M(k-1) = i, \ldots, M(k-\tau+1) = i,\ M(k-\tau) \neq i \mid M(k) = i, Z^{k-1}\}
\end{aligned} \tag{3.3}$$

where the perfect knowledge of the state $M(k)$ allows one to go down
to one index less in the conditioning, i.e., $Z^{k-1}$.

Following (3.1), the conditional probability of transition from i to
j at time k-1 given the observations $Z^{k-1}$ is, in terms of (3.3),

$$\begin{aligned}
p_{ij}(k-1) &\triangleq P\{M(k) = j \mid M(k-1) = i, Z^{k-1}\} \\
&= \sum_{\tau=1}^{k} P\{M(k) = j \mid M(k-1) = i,\ \tau_i(k-1) = \tau, Z^{k-1}\}\,
   P\{\tau_i(k-1) = \tau \mid M(k-1) = i, Z^{k-1}\} \\
&= \sum_{\tau=1}^{k} p_{ij}(\tau)\, g_i^{k-1}(\tau)
\end{aligned} \tag{3.5}$$

Note that the argument of $p_{ij}$ defined in (3.1) is the sojourn
time, while the argument of $p_{ij}$ defined above is the current
time.

The conditional probability mass function (3.3) of the sojourn time
$\tau$ in state i at time k is given by the following expressions

$$g_i^k(1) = 1 - \frac{\mu_i(k-1)}{a_i(k,1)}\, b_i(k,1) \tag{3.6}$$

[Equations (3.7) and (3.8), which give $g_i^k(s)$ for
$s = 2, \ldots, k$ and $g_i^k(k+1)$ in terms of $\mu_i(k-m)$,
$a_i(k,m)$ and $b_i(k,m)$, are not legible in the source.]

Expressions (3.6)-(3.8) are proven by induction in [M1a]. The
notations $a_i$ and $b_i$ used above are defined below.

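The probability that the process will stay s time steps in the same
state i as it is at time k-s is, conditioned on the information at
k-s, given by the expression

$$\begin{aligned}
b_i(k,s) &= P\{M(k) = i, \ldots, M(k-s+1) = i \mid M(k-s) = i, Z^{k-s}\} \\
&= \sum_{n=1}^{k-s+1} \left[\prod_{j=n}^{n+s-1} p_{ii}(j)\right] g_i^{k-s}(n),
\qquad s = 1, \ldots, k
\end{aligned} \tag{3.9}$$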
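The sum in (3.9) can be evaluated directly once the stay probabilities and the sojourn-time pmf are available. The sketch below is only an illustration: the helper name `stay_probability` and the calling convention (a callable `p_ii` and a list `g` indexed from sojourn time 1) are assumptions, not part of the paper.

```python
from math import prod

def stay_probability(p_ii, g, s):
    """Evaluate (3.9): sum over n of g(n) * p_ii(n) * ... * p_ii(n+s-1).

    p_ii : callable, p_ii(tau) is the probability of staying in model i
           after a sojourn of tau steps (hypothetical helper).
    g    : list with g[n-1] = g_i^{k-s}(n) for n = 1, ..., k-s+1.
    s    : number of further steps spent in model i.
    """
    total = 0.0
    for n, g_n in enumerate(g, start=1):
        total += g_n * prod(p_ii(j) for j in range(n, n + s))
    return total

# With a constant stay probability the result reduces to 0.9 ** 2.
b_const = stay_probability(lambda tau: 0.9, [0.5, 0.5], s=2)
```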

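Conditioned on the available information $Z^{k-s}$ at time k-s, the
joint probability of the process residing in the same state i for the
next s time steps is denoted as

$$\begin{aligned}
a_i(k,s) &= P\{M(k) = i, \ldots, M(k-s+1) = i \mid Z^{k-s}\} \\
&= \sum_{j=1}^{r} P\{M(k) = i, \ldots, M(k-s+1) = i \mid M(k-s) = j, Z^{k-s}\}\,
   P\{M(k-s) = j \mid Z^{k-s}\} \\
&= b_i(k,s)\, \mu_i(k-s)
 + \sum_{j \neq i} P\{M(k) = i, \ldots, M(k-s+1) = i \mid M(k-s) = j, Z^{k-s}\}\, \mu_j(k-s) \\
&= b_i(k,s)\, \mu_i(k-s)
 + \sum_{j \neq i} \left[\sum_{n=1}^{k-s+1}
   P\{M(k) = i, \ldots, M(k-s+1) = i \mid M(k-s) = j,\ \tau_j(k-s) = n, Z^{k-s}\}\,
   g_j^{k-s}(n)\right] \mu_j(k-s) \\
&= b_i(k,s)\, \mu_i(k-s)
 + \sum_{j \neq i} \left[\sum_{n=1}^{k-s+1}
   p_{ji}(n)\, p_{ii}(1)\, p_{ii}(2) \cdots p_{ii}(s-1)\, g_j^{k-s}(n)\right] \mu_j(k-s) \\
&= b_i(k,s)\, \mu_i(k-s)
 + \sum_{j \neq i} \left[\sum_{n=1}^{k-s+1}
   p_{ji}(n) \prod_{l=1}^{s-1} p_{ii}(l)\, g_j^{k-s}(n)\right] \mu_j(k-s),
\qquad s = 1, \ldots, k
\end{aligned} \tag{3.10}$$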
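Before turning to the estimator, the averaging in (3.5) can be illustrated numerically: the conditional transition probability is the sojourn-time-dependent probability weighted by the conditional sojourn-time pmf. The function name and the numbers below are assumed for illustration only.

```python
def conditional_transition_prob(p_ij, g_prev):
    """Evaluate (3.5): sum over tau of p_ij(tau) * g_i^{k-1}(tau).

    p_ij   : callable giving the transition probability for sojourn tau.
    g_prev : list with g_prev[tau-1] = g_i^{k-1}(tau), tau = 1, ..., k.
    """
    return sum(p_ij(tau) * g for tau, g in enumerate(g_prev, start=1))

# Assumed example: jump probability grows linearly with the sojourn time.
p_12 = conditional_transition_prob(lambda tau: 0.1 * tau, [0.2, 0.3, 0.5])
```

If `p_ij` does not depend on `tau`, the result is that constant for any pmf, which is the ordinary Markov case.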

4.  The  state  estimation  algorithm 

As indicated in Sec. 1, the optimal estimator for linear systems with
Markov model jumps requires an exponentially increasing memory. Among
the suboptimal approaches discussed, it appears that the IMM is the
most cost-effective in implementation [B4]. In view of this, the
state estimation for a linear system with sojourn-time-dependent
transition probabilities is developed in the sequel based on the IMM
approach.

In this approach, at time k the state estimate is computed under each
possible model hypothesis using r filters (for the r possible
models), with each filter using a different combination of the
previous model-conditioned estimates. Each model transition
probability is a known function of the sojourn time given by (3.1).

Each model has a sojourn time $\tau_i(k)$ in state i which is,
however, not known. The filter has access only to the observations,
from which the conditional pmf of the sojourn time (3.6)-(3.8) can be
obtained; this in turn is to be used in the calculation of the
conditional transition probabilities (3.5).

To find the conditional pdf of the state of the dynamic system
described by (2.1)-(2.3) the total probability theorem is used as
follows:

$$\begin{aligned}
p[x(k) \mid Z^k] &= \sum_{j=1}^{r} p[x(k) \mid M(k) = j, z(k), Z^{k-1}]\, P\{M(k) = j \mid Z^k\} \\
&= \sum_{j=1}^{r} p[x(k) \mid M(k) = j, z(k), Z^{k-1}]\, \mu_j(k)
\end{aligned} \tag{4.1}$$

i.e., r filters running in parallel. The model-conditioned posterior
pdf of the state can be rewritten as (with the irrelevant
conditioning on $Z^{k-1}$ in the numerator omitted)

$$p[x(k) \mid M(k) = j, z(k), Z^{k-1}]
 = \frac{p[z(k) \mid M(k) = j, x(k)]}{p[z(k) \mid M(k) = j, Z^{k-1}]}\,
   p[x(k) \mid M(k) = j, Z^{k-1}] \tag{4.2}$$

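reflecting one cycle of the state estimation filter matched to model
j starting with the prior, which is the last term above. The total
probability theorem is now applied to this prior, yielding

$$\begin{aligned}
p[x(k) \mid M(k) = j, Z^{k-1}]
&= \sum_{i=1}^{r} p[x(k) \mid M(k) = j, M(k-1) = i, Z^{k-1}]\,
   P\{M(k-1) = i \mid M(k) = j, Z^{k-1}\} \\
&= \sum_{i=1}^{r} p[x(k) \mid M(k) = j, M(k-1) = i, Z^{k-1}]\, \mu_{i|j}(k-1|k-1)
\end{aligned} \tag{4.3}$$

where

$$\mu_j(k) \triangleq P\{M(k) = j \mid Z^k\} \tag{4.4}$$

and

$$\mu_{i|j}(k-1|k-1) \triangleq P\{M(k-1) = i \mid M(k) = j, Z^{k-1}\} \tag{4.5}$$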

Note that Eq. (4.3) represents a Gaussian mixture under the typical
Gaussian assumptions on the noise terms in Eqs. (2.1) and (2.2). This
mixture is then approximated by a single moment-matched Gaussian.†
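The moment-matching step can be sketched as follows: a weighted set of Gaussians is collapsed into one Gaussian with the same overall mean and covariance. This is a minimal NumPy illustration; the function name `moment_match` and the array layout are assumptions rather than anything specified in the paper.

```python
import numpy as np

def moment_match(weights, means, covs):
    """Collapse a Gaussian mixture into one Gaussian with equal mean and covariance.

    weights : (r,) mixture probabilities summing to 1
    means   : (r, n) component means
    covs    : (r, n, n) component covariances
    """
    w = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    mean = w @ means
    diffs = means - mean
    # Spread-of-the-means term: sum_i w_i * d_i d_i'
    spread = np.einsum("i,ij,ik->jk", w, diffs, diffs)
    cov = np.einsum("i,ijk->jk", w, covs) + spread
    return mean, cov

# Two scalar components at 0 and 2, each with unit variance.
mean, cov = moment_match([0.5, 0.5], [[0.0], [2.0]], [[[1.0]], [[1.0]]])
```

The spread term is what keeps the single Gaussian from being overconfident when the component means disagree.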

Therefore it follows that the input to the filter matched to model j,
$j = 1, \ldots, r$, is obtained from an interaction of these r
filters. This interaction consists of the mixing of the estimates
$\hat{x}^i(k-1|k-1)$ according to the weightings (probabilities)
$\mu_{i|j}(k-1|k-1)$. The evaluation of the probabilities (4.4) and
(4.5) in the STDM situation is the key result needed to obtain a
recursive state estimation algorithm for this type of model
switching. These probabilities are shown below to follow from the
results in Section 3.

Fig. 4.1 describes the resulting Interacting Multiple Model (IMM)
algorithm, which consists of r interacting filters operating in
parallel. The mixing is done at the input of the filters with the
probabilities, detailed later in (4.7), conditioned on $Z^{k-1}$.

One cycle of the algorithm consists of the following:

Starting with the model-conditioned estimate $\hat{x}^i(k-1|k-1)$,
with associated covariance $P^i(k-1|k-1)$, one computes the mixed
initial condition for the filter matched to $M(k) = j$ according to
(4.3) as follows

$$\hat{x}^{0j}(k-1|k-1) = \sum_{i=1}^{r} \hat{x}^i(k-1|k-1)\, \mu_{i|j}(k-1|k-1) \tag{4.6}$$

From (4.5)

$$\begin{aligned}
\mu_{i|j}(k-1|k-1)
&= \frac{1}{c_j}\, P\{M(k) = j \mid M(k-1) = i, Z^{k-1}\}\, P\{M(k-1) = i \mid Z^{k-1}\} \\
&= \frac{1}{c_j}\, p_{ij}(k-1)\, \mu_i(k-1)
\end{aligned} \tag{4.7}$$

where the notations from (4.4) and (3.5) were used, $c_j$ is a
normalization constant, and

$$\hat{x}^i(k-1|k-1) = E[x(k-1) \mid M(k-1) = i, Z^{k-1}] \tag{4.8}$$

is the model-conditioned state estimate at time k-1. The expression
of $p_{ij}$ for the STDM case using terms involving sojourn time
probabilities is the one obtained in (3.5). The covariance
corresponding to (4.6) is

$$\begin{aligned}
P^{0j}(k-1|k-1) = \sum_{i=1}^{r} \mu_{i|j}(k-1|k-1) \Big\{ & P^i(k-1|k-1) \\
 & + [\hat{x}^i(k-1|k-1) - \hat{x}^{0j}(k-1|k-1)]
     [\hat{x}^i(k-1|k-1) - \hat{x}^{0j}(k-1|k-1)]' \Big\}
\end{aligned} \tag{4.9}$$

The estimate (4.6) and covariance (4.9) are used as input to a
standard Kalman filter matched to $M(k) = j$ to yield the
model-conditioned estimate $\hat{x}^j(k|k)$ and its covariance
$P^j(k|k)$.

The likelihood functions corresponding to the r filters are computed
as

$$\begin{aligned}
\Lambda_j(k) &= p[z(k) \mid M(k) = j, Z^{k-1}] \\
&\approx p[z(k) \mid M(k) = j,\ \hat{x}^{0j}(k-1|k-1),\ P^{0j}(k-1|k-1)]
\end{aligned} \tag{4.10}$$

where the past data have been replaced by (4.6) and (4.9) according
to the key step of the IMM. The model probabilities (4.4) are updated
as follows:

$$\mu_j(k) = P\{M(k) = j \mid Z^k\}
 = \frac{1}{c}\, \Lambda_j(k) \sum_{i=1}^{r} p_{ij}(k-1)\, \mu_i(k-1) \tag{4.11}$$

where c is a normalization constant and the conditional transition
probabilities, $p_{ij}$, are as given in (3.5).

Eqs. (4.7) and (4.11) in combination with $p_{ij}$ are the key
results that make possible the state estimation for a system with
sojourn-time-dependent model transitions.

Finally, for output only, the latest state estimate and covariance
are obtained according to Eqs. (4.1) and (4.3) as

$$\hat{x}(k|k) = \sum_{j=1}^{r} \hat{x}^j(k|k)\, \mu_j(k) \tag{4.12}$$

$$P(k|k) = \sum_{j=1}^{r} \mu_j(k) \Big\{ P^j(k|k)
 + [\hat{x}^j(k|k) - \hat{x}(k|k)][\hat{x}^j(k|k) - \hat{x}(k|k)]' \Big\} \tag{4.13}$$

5. Simulation Results


The algorithm developed in Sec. 4 using the sojourn time pmf obtained
in Sec. 3 is used to estimate the state of the system. In the first
example the results of this STDM-based IMM estimation scheme are
compared with results obtained from an IMM algorithm based upon a
Markov model transition assumption. In the second example the
STDM-based IMM estimation scheme is compared to the
detection-estimation algorithm of [M9]. It is assumed that an STDM
process described in Sec. 2 governs the switching between models. In
the following T is the sampling period and k is an integer
representing the number of sampling periods since time zero.
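For orientation, one cycle of the Section 4 algorithm, as it would be run in such simulations, might be sketched as below. This is a minimal NumPy illustration under stated assumptions: linear-Gaussian models, zero-mean noises (the means $u_j$ of (2.5) are omitted), and conditional transition probabilities supplied by the caller, for example from (3.5). The function name `imm_cycle` and its argument layout are hypothetical.

```python
import numpy as np

def imm_cycle(x, P, mu, p_trans, z, F, Q, H, R):
    """One IMM cycle for r linear-Gaussian models (zero-mean noises assumed).

    x[i], P[i]    : model-conditioned estimate and covariance at k-1
    mu            : (r,) model probabilities at k-1
    p_trans[i, j] : conditional transition probabilities p_ij(k-1)
    F, Q, H, R    : lists of per-model system matrices
    """
    r = len(mu)
    joint = p_trans * mu[:, None]      # joint[i, j] = p_ij * mu_i
    c = joint.sum(axis=0)              # normalisers c_j of (4.7)
    mix = joint / c                    # mixing probabilities mu_{i|j}
    x_new, P_new, lik = [], [], np.zeros(r)
    for j in range(r):
        # Mixed initial condition, (4.6) and (4.9)
        x0 = sum(mix[i, j] * x[i] for i in range(r))
        P0 = sum(mix[i, j] * (P[i] + np.outer(x[i] - x0, x[i] - x0))
                 for i in range(r))
        # Standard Kalman filter matched to model j
        xp = F[j] @ x0
        Pp = F[j] @ P0 @ F[j].T + Q[j]
        nu = z - H[j] @ xp
        S = H[j] @ Pp @ H[j].T + R[j]
        K = Pp @ H[j].T @ np.linalg.inv(S)
        x_new.append(xp + K @ nu)
        P_new.append(Pp - K @ S @ K.T)
        # Gaussian likelihood of the innovation, (4.10)
        norm = np.sqrt((2 * np.pi) ** len(nu) * np.linalg.det(S))
        lik[j] = np.exp(-0.5 * nu @ np.linalg.solve(S, nu)) / norm
    # Model probability update, (4.11)
    mu_new = lik * c
    mu_new = mu_new / mu_new.sum()
    # Combined output, (4.12) and (4.13)
    x_out = sum(mu_new[j] * x_new[j] for j in range(r))
    P_out = sum(mu_new[j] * (P_new[j] + np.outer(x_new[j] - x_out,
                                                 x_new[j] - x_out))
                for j in range(r))
    return x_new, P_new, mu_new, x_out, P_out

# Assumed toy setup: two identical double-integrator models with T = 1.
Fm = np.array([[1.0, 1.0], [0.0, 1.0]])
Hm = np.array([[1.0, 0.0]])
out = imm_cycle([np.zeros(2), np.zeros(2)], [np.eye(2), np.eye(2)],
                np.array([0.5, 0.5]), np.array([[0.9, 0.1], [0.1, 0.9]]),
                np.array([0.3]), [Fm, Fm], [0.01 * np.eye(2)] * 2,
                [Hm, Hm], [np.eye(1)] * 2)
```

In the STDM setting only `p_trans` changes from cycle to cycle; the mixing, filtering and combination steps are those of the standard IMM.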

Example  1 

The estimation of a controlled double integrator system with process
and measurement noises is considered with a gain failure. The two
possible models are given by the following system equation

$$x^i(k+1) = \begin{bmatrix} 1 & T \\ 0 & 1 \end{bmatrix} x^i(k)
 + \begin{bmatrix} 0 \\ b^i \end{bmatrix} u(k)
 + \begin{bmatrix} T^2/2 \\ T \end{bmatrix} v(k),
 \qquad i = 1, 2 \tag{5.1}$$

with measurement equation

$$z(k) = \begin{bmatrix} 1 & 0 \end{bmatrix} x^i(k) + w(k) \tag{5.2}$$

The models differ in the control gain parameter $b^i$.

†This is the key step of the IMM that yields an algorithm with fixed
(and modest) computational requirements: using r filters it yields
performance comparable to the Generalized Pseudo-Bayesian algorithm
with $r^2$ filters [B1].

The process and measurement noises are mutually uncorrelated with
zero mean and variances given by

$$E[v(k)\, v(j)] = 1 \cdot 10^{-2}\, \delta_{kj} \tag{5.3}$$

$$E[w(k)\, w(j)] = \delta_{kj} \tag{5.4}$$

The control gain parameters were chosen to be $b^1 = 2$ and
$b^2 = 1$.

The transition probabilities $p_{11}(\tau)$ and $p_{22}(\tau)$
defined in (3.1) are shown in Fig. 5-1. Note that $p_{ij}(\tau)$, for
$i \neq j$, are given by

$$p_{ij}(\tau) = 1 - p_{ii}(\tau) \tag{5.5}$$

Thus we see that $p_{11}(\tau)$ is initially 0.5, rises rapidly to
0.99 and then decreases towards 0.1, which is its steady state value.
We also see that $p_{22}(\tau)$ has a value close to 1.0 for this
range of $\tau$, and thus model state two is essentially an absorbing
state.

Figs. 5-2 through 5-4 present the results of 100 Monte Carlo runs.
The true system was initially model 1 for every run and the model
transitions occurred according to the probabilities of Fig. 5-1.

For simplicity, since we are mainly interested in the estimation of
the state, and not in the control strategy, we set $u(k) = 3$ for all
k.

The Markov-based IMM used for comparison utilized the a priori
average transition probabilities $\bar{p}_{ij}$, obtained by taking
the expected value of the transition probabilities shown in Fig. 5-1.
In other words, the conditional probability $p_{ij}(k-1)$ from (3.5)
is replaced by the a priori (unconditional) $\bar{p}_{ij}$ given
below in (5.7).

The probability of having a sojourn time $\tau_i$ equal to $\tau$ is
the probability that model i is in effect for $\tau - 1$ steps, and
then a transition occurs at step $\tau$,

$$P\{\tau_i = \tau\} = \left[\prod_{j=1}^{\tau-1} p_{ii}(j)\right] [1 - p_{ii}(\tau)] \tag{5.6}$$

Thus we get

$$\bar{p}_{ii} = \sum_{\tau=1}^{\infty} p_{ii}(\tau)\, P\{\tau_i = \tau\}, \qquad i = 1, 2 \tag{5.7a}$$

and

$$\bar{p}_{ij} = 1 - \bar{p}_{ii} \tag{5.7b}$$
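The a priori quantities in (5.6) and (5.7a) are straightforward to compute from a given stay-probability function. The sketch below truncates the infinite sum at an assumed horizon `tau_max`; the function names are hypothetical, and the constant stay probability in the example is chosen only because its answer is known in closed form.

```python
def sojourn_pmf(p_ii, tau_max):
    """(5.6): P{tau_i = tau} = [prod_{j<tau} p_ii(j)] * [1 - p_ii(tau)]."""
    pmf, stay = [], 1.0
    for tau in range(1, tau_max + 1):
        pmf.append(stay * (1.0 - p_ii(tau)))
        stay *= p_ii(tau)   # probability of having survived tau steps
    return pmf

def average_stay_prob(p_ii, tau_max):
    """(5.7a): expected value of p_ii(tau) under the a priori sojourn pmf."""
    pmf = sojourn_pmf(p_ii, tau_max)
    return sum(p_ii(tau) * w for tau, w in enumerate(pmf, start=1))

# Constant stay probability 0.5: the pmf is geometric and the average is 0.5.
pmf = sojourn_pmf(lambda tau: 0.5, 30)
p_bar = average_stay_prob(lambda tau: 0.5, 30)
```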

Figs. 5-2 and 5-3 are plots of the RMS error in $x_1(k)$ and $x_2(k)$
respectively. From Fig. 5-2 we can see that the STDM-based IMM
estimator improves the RMS error in $x_1(k)$ by as much as 20
percent.

From Fig. 5-3 we see that the RMS error in $x_2(k)$ of the STDM-based
IMM estimator is as low as one third the error of the Markov-based
IMM scheme. Thus the mean-square error improved by an order of
magnitude.

Fig.  5-4  is  a  plot  of  the  average  model 
probability  error.  This  is  the  error  in  the  filter's 
determination  of  the  correct  system  model. 

Typical running times for the STDM-based IMM vs. the Markov-based IMM
are in the ratio of 3:1. The length of the time-span over which the
sojourn time pmf is computed can be truncated, since the pmf becomes
negligible after 15 steps. This keeps the additional calculations of
the STDM-based filter within reasonable limits and prevents any
growth of the computational or memory requirements.

Example  2 

In this example we compare the detection-estimation algorithm (DEA)
based semi-Markov estimator of [M9] with the STDM-based IMM estimator
of this paper. For this purpose the system and the semi-Markov model
switching process attributes are as in [M9], example 3, and are
repeated here for ease of reference.

The model process $M(k)$ is taken as a semi-Markov chain. The scalar
system is described by [M9]

$$\begin{aligned}
x(k+1) &= 1.01\, x(k) + v(k) \\
z(k) &= x(k) + \theta(M(k))\, w(k), \qquad k = 0, 1, 2, \ldots
\end{aligned} \tag{5.8}$$

where $r = 3$ models, $\theta(1) = 100$, $\theta(2) = 10$, and
$\theta(3) = 1$.


Here $\{v(k)\}$ and $\{w(k)\}$ are mutually independent zero-mean
Gaussian white noise sequences with covariances $Q = 0.1$ and
$R = 1.0$, respectively. The initial conditions are
$x(0) \sim \mathcal{N}(30, 400)$, $P\{M(0) = i\} = 1/3$ for
$i = 1, 2, 3$. For the real system $x(0) = 1$ in every simulation.
The process $M(k)$ is modeled by a semi-Markov chain with the
imbedded Markov chain transition probabilities given by
$p_{11} = p_{22} = p_{33} = 0$, $p_{12} = 0.7$, $p_{13} = 0.3$,
$p_{21} = 0.6$, $p_{23} = 0.4$, $p_{31} = 0.3$, and $p_{32} = 0.7$.
The sojourn time probability mass functions $p_i(\tau)$ are assumed
to be

$$\begin{aligned}
p_1(\tau) &= a_1 \exp(-|\tau - 3|) \\
p_2(\tau) &= a_2 \exp(-|\tau - 6|) \\
p_3(\tau) &= a_3 \exp(-|\tau - 8|)
\end{aligned} \tag{5.9}$$

for $\tau \geq 0$, with $a_i$ such that

$$\sum_{\tau=0}^{\infty} p_i(\tau) = 1, \qquad i = 1, 2, 3. \tag{5.10}$$
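The normalizing constants $a_i$ in (5.9)-(5.10) can be obtained numerically, as in the sketch below. The function name and the truncation point `tau_max` are assumptions for illustration; the neglected tail is of order $e^{-(\tau_{max} - c)}$ for a pmf centred at c and is far below rounding error for the default used here.

```python
import math

def sojourn_pmf_exp(center, tau_max=200):
    """pmf proportional to exp(-|tau - center|) on tau = 0, ..., tau_max.

    Returns the normalising constant a and the pmf values.
    """
    raw = [math.exp(-abs(tau - center)) for tau in range(tau_max + 1)]
    a = 1.0 / sum(raw)
    return a, [a * v for v in raw]

# Centres 3, 6 and 8 correspond to the three pmfs of (5.9).
a1, p1 = sojourn_pmf_exp(3)
a2, p2 = sojourn_pmf_exp(6)
a3, p3 = sojourn_pmf_exp(8)
```

Each pmf peaks at its centre, so the three models have most likely sojourn times of 3, 6 and 8 steps respectively.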

The results, averaged over 50 Monte Carlo runs, are shown in Figs.
5-5 and 5-6. In Fig. 5-5 we compare the rms state errors of the
two-filter DEA-based semi-Markov estimator of [M9] with our
two-filter GPB-based semi-Markov approach, and with the GPB estimator
using 3 filters. Note that the values for the DEA estimator are
two-time-step smoothed values (see [M9], Fig. 7, M = 2 most likely
histories retained) whereas the values for the STDM-IMM estimator are
filtered values. We can see that our estimator with two filters is
stable, as opposed to the unstable two-filter DEA method.

The  plot  of  the  3  filter  STDM-IMM  estimator  shown 
in  Fig.  5-5  is  given  so  that  one  can  compare  the 
improvement  obtainable  by  adding  an  extra  filter  to 
this  approach.  We  see  that  the  long  term  trend  is 
for  the  3  filter  STDM-IMM  to  give  a  smaller  rms  error 
than  the  version  with  2  filters. 

In Fig. 5-6 we compare the probability of error obtained using a 1
filter DEA estimator versus the 3 filter STDM-IMM estimator. Both
curves were obtained from a filtering operation (see [M9], Fig. 10,
N = 0).

We  can  see  that  the  present  estimator  gives  a  much 
clearer  indication  of  the  correct  system  structure 
and  hence  is  preferable  for  failure  detection. 

6. Conclusion

We have applied the recursive state estimation algorithm for dynamic
systems whose state model experiences jumps according to a
sojourn-time-dependent Markov (STDM) chain to the problem of failure
detection. The algorithm, which is of the IMM type, uses noisy state
observations, and the calculations are done in the following order:

1. Probability of each model being the current model

2. Sojourn time pmf in the current model

3. Model-conditioned state vector estimates and covariances

4. Overall state vector estimate and its covariance.


The first example simulated indicates that the use of the STDM-based
IMM estimator can give a substantial improvement in state estimation
over a Markov-based IMM. The latter relies on the a priori average
transition probabilities while the former uses conditional transition
probabilities obtained from the conditional sojourn time
distribution. This example shows that the STDM-based scheme is
substantially better than the Markov-based scheme in determining the
true system model, which is beneficial for failure detection schemes.

The second example simulated shows that, for the particular system
under consideration, the STDM-based IMM estimator, which is a
hypothesis merging technique, compares favorably in terms of the
probability of error to the detection-estimation algorithm based
estimator, which discards the unlikely parameter history hypotheses.
Distributed Adaptive Estimation with
Probabilistic  Data  Association* 

K. C. CHANG† and Y. BAR-SHALOM‡§

A  fusion  algorithm  for  target  state  estimation  under  cluttered  environment 
with  uncertain  measurement  origins  and  uncertain  system  models  in  a 
distributed  manner  can  be  applied  for  tracking  a  maneuvering  target  in  a 
cluttered and low detection environment.


Key  Words— Distributed  estimation;  multiple  model;  target  tracking;  probabilistic  data  association; 
Bayesian  methods;  distributed  sensor  networks. 


Abstract — The  probabilistic  data  association  filter  (PDAF) 
estimates  the  state  of  a  target  in  a  cluttered  environment. 
This  suboptimal  Bayesian  approach  assumes  that  the  exact 
target and measurement models are known. However, in
most  practical  applications,  there  are  difficulties  in  obtaining 
an  exact  mathematical  model  of  the  physical  process.  In  this 
paper,  the  problem  of  estimating  target  states  with  uncertain 
measurement  origins  and  uncertain  system  models  in  a 
distributed  manner  is  considered.  First,  a  scheme  is  described 
for  local  processing,  then  the  fusion  algorithm  which 
combines  the  local  processed  results  into  a  global  one  is 
derived.  The  algorithm  can  be  applied  for  tracking  a 
maneuvering  target  in  a  cluttered  and  low  detection 
environment  with  a  distributed  sensor  network. 

1.  INTRODUCTION 

The major difficulty in tracking a target with switching
models/parameters in a cluttered environment is due to the
fundamental conflict between the operations of model/parameter
identification and data association, since the measurements with
large innovations are considered as unlikely to have originated from
the target of interest. In this paper, a multiple model approach in
conjunction with the probabilistic data association (PDA) filter
(Bar-Shalom and Tse, 1975; Bar-Shalom, 1978) to track a target with
switching models using distributed sensors is presented.
Several approaches have been proposed to perform the state estimation
of a system together with identification of each model (out of a
finite set) in a centralized framework. One of the significant
schemes is the so-called generalized pseudo Bayes (GPB) method
(Tugnait, 1982; Chang and Athans, 1978) and the other is the
interacting multiple model (IMM) algorithm (Blom, 1984; Blom and
Bar-Shalom, 1988). The general structure of these algorithms consists
of a bank of filters for the state cooperating with a filter for the
parameters. A GPB algorithm of order n (GPBn) needs $N^n$ filters in
its bank (Tugnait, 1982). The IMM algorithm performs nearly as well
as the GPB2 method with notably less computation, namely, at the cost
of GPB1 (Blom and Bar-Shalom, 1988). A distributed estimation scheme
with uncertain models has also been derived (Chang and Bar-Shalom,
1987). However, in all the above approaches, a perfect data
association was assumed, i.e. there is no uncertainty in measurement
origins.

To take into account the data association problem, an adaptive PDA
algorithm was presented in Gauvrit (1984) for tracking in a cluttered
environment with unknown noise statistics. This algorithm identifies
on line the unknown variances of the process and measurement noises
but uses an earlier (static) multiple model approach (Bar-Shalom,
1988). In this paper, a distributed estimation problem which takes
into account both model and measurement origin uncertainties will be
derived. To handle the model uncertainty, a more general formulation
with dynamic multiple models described by Markovian parameters will
be adopted. These parameters may switch within a finite set of values
which represent different system models. To take care of the missing
and false measurements, the PDA scheme will be employed. The
probabilities of associating measurements to a target given different
system models will be computed and used to weight the combination of
state estimates.

The  problem  is  formulated  in  Section  2.  A 
centralized  algorithm  which  combines  the  IMM 
algorithm  and  the  PDA  filter,  resulting  in  the 
MMPDA  (multiple  model  PDA)  filter,  for  local 
processing  will  be  described  in  Section  3.*  Then 
the  fusion  algorithm  which  combines  the  local 
processed  results  from  multiple  sensors  into  a 
global  one  will  be  presented  in  Section  4. 

The  algorithm  can  be  applied  for  tracking  a 
maneuvering  target  in  a  cluttered  and  low 
detection  environment  with  a  distributed  sensor 
network  (DSN). 

2.  PROBLEM  FORMULATION 

Let  us  consider  the  two-node  scenario  similar 
to  that  given  in  Chang  et  al.  (1986),  where  each 
node  processes  the  local  measurements  from  its 
own  sensor  and  sends  the  local  estimates  to  the 
fusion processor periodically. The fusion processor
then sends back the processed results
after  each  communication  time. 

The dynamics of the target in track are modeled as

$$x(k) = f[x(k-1), M(k), v[M(k), k-1]] \tag{1}$$

where $x(k)$ is the state vector, $v[M(k), k-1]$ the process noise
vector and $M(k)$ the system model from time k-1 to k. Assume the
random model process $M(k)$ is Markov and it can only take values
from a finite set $\mathcal{M}$, which contains r distinct models,†
i.e.

$$\mathcal{M} = \{M_j\}_{j=1}^{r}. \tag{2}$$

The measurement system is modeled as follows. If the measurement
originates from the target in track, then

$$z^i(k) = h^i[x(k), M(k)] + w^i[M(k), k] \tag{3}$$

where $z^i(k)$ is the measurement vector from sensor i and
$w^i[M(k), k]$ is the corresponding measurement noise vector. The two
noise sequences are mutually independent and independent of the
initial state.


*The MMPDA algorithm has been implemented in the interactive software
MULTIDAT (Bar-Shalom, 1987, 1988).

†The models can have states of different dimension. In this case, the
lower dimension state vectors are augmented with suitable components
that are zero w.p.1, to make them compatible. This is elaborated on
in Section 5.

‡Such a rule, also called "gating", considers only the measurements
within some distance from the predicted measurements (for details,
see, e.g. Bar-Shalom and Fortmann (1988)).


As in the PDA filter, it is assumed that a rule of validation of the
candidate measurements‡ is available such that it guarantees that the
current return will be retained with a given probability. For each
sensor, denote the validated measurements at time k as

$$Z^i(k) = \{z_l^i(k)\}_{l=1}^{m_k^i} \tag{4}$$

where $m_k^i$ is the number of validated measurements of sensor i at
time k, and

$$Z^{i,k} \triangleq \{Z^i(l)\}_{l=1}^{k} \tag{5}$$
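A validation rule of this kind might be sketched as follows. The source does not specify the rule, so an ellipsoidal gate on the squared Mahalanobis distance is assumed here; the function name and the threshold `gate = 9.21` (roughly the 99% point of a chi-square distribution with two degrees of freedom) are illustrative assumptions.

```python
import numpy as np

def validate_measurements(measurements, z_pred, S, gate=9.21):
    """Keep measurements whose squared Mahalanobis distance from the
    predicted measurement z_pred (innovation covariance S) is within `gate`.
    """
    S_inv = np.linalg.inv(S)
    kept = []
    for z in measurements:
        nu = np.asarray(z, dtype=float) - z_pred   # innovation
        if nu @ S_inv @ nu <= gate:
            kept.append(z)
    return kept

# One return close to the prediction and one far away (likely clutter).
validated = validate_measurements([[0.5, 0.5], [10.0, 10.0]],
                                  np.zeros(2), np.eye(2))
```

The length of the returned list plays the role of $m_k^i$ in (4).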

The local model-conditioned state pdfs at sensor i are

$$p[x(k) \mid M_j(k), Z^{i,k}, Y^{i,k}], \qquad i = 1, 2;\ j = 1, \ldots, r \tag{6}$$

with the corresponding model probabilities

$$P\{M_j(k) \mid Z^{i,k}, Y^{i,k}\}, \qquad i = 1, 2;\ j = 1, \ldots, r \tag{7}$$

where

$$Y^{i,k} = \{Y^i(1), \ldots, Y^i(k)\} \tag{8}$$

and $Y^i(k)$ denotes the information received by node i during the
sampling period ending at time k, which is defined as the fusion
result (namely, global conditional pdf) up to time k-1.

Assuming lossless communication and that the information communicated is the sufficient statistics, i.e. the information contained in $Y^{i,k}$ is equivalent to the information in $Z^{\bar{i},k-1}$, we have the following equality:

$$p(x(k) \mid Z^{i,k-1}, Y^{i,k}) = p(x(k) \mid Z^{i,k-1}, Z^{\bar{i},k-1}) = p(x(k) \mid Z^{k-1}) \qquad (9)$$

where $\bar{i}$ represents all sensors other than sensor $i$ and $Z^k = \{Z(l)\}_{l=1}^{k}$, where $Z(l)$ represents the measurements from all sensors at time $l$.

Given  the  above  models,  the  question  now  is 
how  the  global  conditional  pdf  can  be  con¬ 
structed  by  fusing  together  the  local  ones. 
Specifically,  we  shall  investigate  what  is  the 
necessary  and  sufficient  information  that  has  to 
be  transmitted  between  nodes.  The  derivations 
will be carried out for arbitrary pdfs; however,
the  simulations  assume  linear  models  with 
Gaussian  random  variables,  in  which  case  the 
state’s  model-conditioned  pdf  (6)  is  Gaussian 
and  the  overall  conditional  pdf  of  the  state  is  a 
Gaussian  mixture  (Bar-Shalom,  1988). 

3.  CENTRALIZED  ALGORITHM  FOR  LOCAL 
PROCESSING 

For each local node, the centralized algorithm, in which all measurements are sent to and processed by one processor, is described below. The goal is to compute the conditional state distribution given the local accumulated measurements. With only model uncertainty, the local conditional pdf at sensor $i$ can be obtained as

$$p(x(k) \mid Z^{i,k}, Y^{i,k}) = \sum_{j=1}^{r} p(x(k) \mid M_j(k), Z^{i,k}, Y^{i,k})\, P\{M_j(k) \mid Z^{i,k}, Y^{i,k}\}. \qquad (10)$$

When the additional measurement origin uncertainties are present, the above equation becomes

$$p(x(k) \mid Z^{i,k}, Y^{i,k}) = \sum_{j=1}^{r} \left[ \sum_{l_i=0}^{m^i_k} p(x(k) \mid M_j(k), \theta^i_{l_i}, Z^{i,k}, Y^{i,k})\, P\{\theta^i_{l_i} \mid M_j(k), Z^{i,k}, Y^{i,k}\} \right] P\{M_j(k) \mid Z^{i,k}, Y^{i,k}\} \qquad (11)$$

where $\theta^i_{l_i}$ is the event that $z^i_{l_i}(k)$ is the correct measurement and $\theta^i_0$ denotes no correct measurement.

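The first term on the right-hand side of equation (11) is the standard PDA filter based on model $M_j$, where for each $\theta^i_{l_i}$

$$p(x(k) \mid M_j(k), \theta^i_{l_i}, Z^{i,k}, Y^{i,k}) = \frac{1}{c^i_1[M_j(k), \theta^i_{l_i}]}\, p(Z^i(k) \mid x(k), M_j(k), \theta^i_{l_i}, Z^{i,k-1}, Y^{i,k})\, p(x(k) \mid M_j(k), Z^{i,k-1}, Y^{i,k}) \qquad (12)$$

where $\theta^i_{l_i}$ has been omitted in the last term above (since it is irrelevant) and

$$c^i_1[M_j(k), \theta^i_{l_i}] = \int p(Z^i(k) \mid x(k), M_j(k), \theta^i_{l_i}, Z^{i,k-1}, Y^{i,k})\, p(x(k) \mid M_j(k), Z^{i,k-1}, Y^{i,k})\, dx(k) = p(Z^i(k) \mid M_j(k), \theta^i_{l_i}, Z^{i,k-1}, Y^{i,k}). \qquad (13)$$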


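Using Bayes' rule, the second term on the right-hand side of equation (11) is

$$\begin{aligned} P\{\theta^i_{l_i} \mid M_j(k), Z^{i,k}, Y^{i,k}\} &= \frac{p(Z^i(k) \mid M_j(k), \theta^i_{l_i}, Z^{i,k-1}, Y^{i,k})\, P\{\theta^i_{l_i} \mid M_j(k), Z^{i,k-1}, Y^{i,k}\}\, p(M_j(k), Z^{i,k-1}, Y^{i,k})}{p(Z^i(k) \mid M_j(k), Z^{i,k-1}, Y^{i,k})\, p(M_j(k), Z^{i,k-1}, Y^{i,k})} \\ &= \frac{c^i_1[M_j(k), \theta^i_{l_i}]}{c^i_2[M_j(k)]}\, P\{\theta^i_{l_i} \mid M_j(k), Z^{i,k-1}, Y^{i,k}\} \end{aligned} \qquad (14)$$

where

$$c^i_2[M_j(k)] = \sum_{l_i=0}^{m^i_k} c^i_1[M_j(k), \theta^i_{l_i}]\, P\{\theta^i_{l_i} \mid M_j(k), Z^{i,k-1}, Y^{i,k}\} = p(Z^i(k) \mid M_j(k), Z^{i,k-1}, Y^{i,k}). \qquad (15)$$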

In equation (13), the joint measurement density is (see, e.g. Bar-Shalom (1988))

$$p(Z^i(k) \mid M_j(k), \theta^i_{l_i}, Z^{i,k-1}, Y^{i,k}) = \prod_{l=1}^{m^i_k} p(z^i_l(k) \mid M_j(k), \theta^i_{l_i}, Z^{i,k-1}, Y^{i,k}) = \begin{cases} V_k^{-m^i_k} & \text{if } l_i = 0 \\ V_k^{-m^i_k + 1}\, p[z^i_{l_i}(k) \mid M_j(k)] & \text{otherwise} \end{cases} \qquad (16)$$

where $V_k$ is the volume of the validation region, because of our assumption that the incorrect measurements are uniformly distributed,* independent of each other and of the correct measurement, and

$$p[z^i_{l_i}(k) \mid M_j(k)] = P_G^{-1}\, p(z^i_{l_i}(k) \mid M_j(k), \theta^i_{l_i}, Z^{i,k-1}, Y^{i,k}) \qquad (17)$$

is the truncated density, which is zero outside the validation region, where $P_G$ is the probability that the correct return will lie in the validation region.

In equation (14), $P\{\theta^i_{l_i} \mid M_j(k), Z^{i,k-1}, Y^{i,k}\}$ is the prior probability of the event $\theta^i_{l_i}$ given that model $M_j$ is correct at time $k$. By choosing a large enough validation threshold, this probability becomes independent of $M_j(k)$ and is assumed to be the same for all $\theta^i_{l_i}$, unless target signature information can be used. If no such information is available, then

$$P\{\theta^i_{l_i} \mid M_j(k), Z^{i,k-1}, Y^{i,k}\} = \begin{cases} 1 - P_G P_D & \text{if } l_i = 0 \\ \dfrac{P_G P_D}{m^i_k} & \text{otherwise} \end{cases} \qquad (18)$$

where $P_D$ is the probability that the correct return will be detected.
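The association probabilities of equations (14) and (16)-(18) can be illustrated with a small sketch. The function names and argument layout below are hypothetical, and the per-measurement likelihoods are assumed to be supplied by the caller (for example, from a model-conditioned Gaussian innovation density); this is an illustration under those assumptions, not the MULTIDAT implementation.

```python
def prior_association_probs(m_k, p_d, p_g):
    """Prior probabilities [P(theta_0), ..., P(theta_m)] following equation (18)."""
    if m_k < 1:
        raise ValueError("equation (18) is stated for at least one validated measurement")
    p_none = 1.0 - p_g * p_d      # no validated measurement is correct
    p_each = p_g * p_d / m_k      # each validated measurement equally likely a priori
    return [p_none] + [p_each] * m_k


def posterior_association_probs(likelihoods, volume, p_d, p_g):
    """Posterior association probabilities in the spirit of (14) and (16)-(18).

    likelihoods[l-1] is the (untruncated) density of validated measurement l
    under the model considered; it is an input assumed by this sketch.
    """
    m_k = len(likelihoods)
    prior = prior_association_probs(m_k, p_d, p_g)
    # l = 0: all m_k measurements are clutter, uniform in the validation volume.
    weights = [volume ** (-m_k) * prior[0]]
    # l > 0: measurement l is correct (truncated density, eq. (17)), the rest clutter.
    for lik, p in zip(likelihoods, prior[1:]):
        weights.append(volume ** (-m_k + 1) * (lik / p_g) * p)
    total = sum(weights)          # plays the role of the normalizer in (15)
    return [w / total for w in weights]
```

A measurement with zero likelihood receives zero posterior weight, and the remaining weight is shared between the "no correct measurement" event and the other candidates.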

*For more elaborate models see Bar-Shalom (1988).

For each model $M_j(k)$ and event $\theta^i_{l_i}$, equation (12) is the standard filtering equation. In that equation, by using the IMM approach (Blom and Bar-Shalom, 1988), the extrapolated pdf is obtained by combining the extrapolations of the prior pdfs (independent of the event $\theta^i_{l_i}$)

$$\begin{aligned} p(x(k) \mid M_j(k), Z^{i,k-1}, Y^{i,k}) &= \sum_{l=1}^{r} p(x(k) \mid M_j(k), M_l(k-1), Z^{i,k-1}, Y^{i,k})\, P\{M_l(k-1) \mid M_j(k), Z^{i,k-1}, Y^{i,k}\} \\ &= \frac{\sum_{l=1}^{r} p(x(k) \mid M_j(k), M_l(k-1), Z^{i,k-1}, Y^{i,k})\, P\{M_j(k), M_l(k-1) \mid Z^{i,k-1}, Y^{i,k}\}}{P\{M_j(k) \mid Z^{i,k-1}, Y^{i,k}\}} \\ &= \frac{1}{c^i_3[M_j(k)]} \sum_{l=1}^{r} \left[ p(x(k) \mid M_j(k), M_l(k-1), Z^{i,k-1}, Y^{i,k})\, P\{M_j(k) \mid M_l(k-1)\}\, P\{M_l(k-1) \mid Z^{i,k-1}, Y^{i,k}\} \right] \end{aligned} \qquad (19)$$

where $p(x(k) \mid M_j(k), M_l(k-1), Z^{i,k-1}, Y^{i,k})$ is the extrapolation of the conditional state pdf given $Z^{i,k-1}$ and $Y^{i,k}$ from model $M_l(k-1)$ to model $M_j(k)$ and

$$c^i_3[M_j(k)] = P\{M_j(k) \mid Z^{i,k-1}, Y^{i,k}\} = \sum_{l=1}^{r} P\{M_j(k) \mid M_l(k-1)\}\, P\{M_l(k-1) \mid Z^{i,k-1}, Y^{i,k}\}. \qquad (20)$$
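As a rough illustration of equation (20) and of the mixing weights that appear in (19), the sketch below propagates model probabilities through a Markov transition matrix. The function names and the list-of-lists layout (`transition[l][j]` standing for $P\{M_j(k) \mid M_l(k-1)\}$) are assumptions made for this example.

```python
def predict_model_probs(transition, prev_probs):
    """Predicted model probabilities c3[j], as in equation (20).

    transition[l][j] = P{M_j(k) | M_l(k-1)}; prev_probs[l] = P{M_l(k-1) | past data}.
    """
    r = len(prev_probs)
    return [sum(transition[l][j] * prev_probs[l] for l in range(r)) for j in range(r)]


def mixing_probs(transition, prev_probs):
    """Weights P{M_l(k-1) | M_j(k), past data} used to combine the extrapolations in (19).

    Returns mix[j][l]; each row sums to one.
    """
    r = len(prev_probs)
    c3 = predict_model_probs(transition, prev_probs)
    return [[transition[l][j] * prev_probs[l] / c3[j] for l in range(r)]
            for j in range(r)]
```

With a symmetric two-model transition matrix and equal previous probabilities, the predicted probabilities stay equal, while the mixing weights favor the extrapolation that starts from the same model.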

The last term of equation (11) is the a posteriori model probability, which is obtained as

$$P\{M_j(k) \mid Z^{i,k}, Y^{i,k}\} = \frac{1}{c^i_4}\, p(Z^i(k) \mid M_j(k), Z^{i,k-1}, Y^{i,k})\, P\{M_j(k) \mid Z^{i,k-1}, Y^{i,k}\} = \frac{c^i_2[M_j(k)]\, c^i_3[M_j(k)]}{c^i_4} \qquad (21)$$

where

$$c^i_4 = \sum_{j=1}^{r} c^i_2[M_j(k)]\, c^i_3[M_j(k)] = p(Z^i(k) \mid Z^{i,k-1}, Y^{i,k}) \qquad (22)$$

and $c^i_2[M_j(k)]$ and $c^i_3[M_j(k)]$ have been obtained in equations (15) and (20), respectively.

Fig. 1. Centralized MMPDA algorithm with r = 2 at sensor i.
Equations (12)-(21) complete a recursive cycle of the local processing. A flow diagram of the local MMPDA algorithm is given in Fig. 1. The flow of data is represented by the model-conditioned means $\hat{x}_j$ and the model probabilities $P_j$.
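The model probability update of equations (21) and (22) reduces to a normalized product of model likelihoods and predicted model probabilities. The sketch below uses hypothetical names; the likelihood values $c^i_2[M_j(k)]$ are assumed to be available from the PDA step.

```python
def update_model_probs(model_likelihoods, predicted_probs):
    """A posteriori model probabilities, following equations (21)-(22).

    model_likelihoods[j] plays the role of c2[M_j] = p(Z(k) | M_j, past data);
    predicted_probs[j] plays the role of c3[M_j] from equation (20).
    Returns (posterior probabilities, normalizer c4).
    """
    products = [c2 * c3 for c2, c3 in zip(model_likelihoods, predicted_probs)]
    c4 = sum(products)            # equation (22)
    if c4 <= 0.0:
        raise ValueError("all model likelihoods are zero; cannot normalize")
    return [p / c4 for p in products], c4
```

For example, likelihoods of 0.2 and 0.6 with equal predicted probabilities give posterior model probabilities of 0.25 and 0.75.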

4.  FUSION  ALGORITHM 

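With the local conditional pdfs obtained in Section 3, we can now derive the fusion algorithm to obtain the global pdf. Similar to equations (10) and (11), the global conditional pdf can be obtained as

$$\begin{aligned} p(x(k) \mid Z^k) &= \sum_{j=1}^{r} p(x(k) \mid M_j(k), Z^k)\, P\{M_j(k) \mid Z^k\} \\ &= \sum_{j=1}^{r} \left\{ \sum_{l_1=0}^{m^1_k} \sum_{l_2=0}^{m^2_k} p(x(k) \mid M_j(k), \theta^1_{l_1}, \theta^2_{l_2}, Z^k)\, P\{\theta^1_{l_1}, \theta^2_{l_2} \mid M_j(k), Z^k\} \right\} P\{M_j(k) \mid Z^k\}. \end{aligned} \qquad (23)$$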

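Assuming measurements from different sensors are independent given the target state, the first term on the right-hand side of equation (23) can be obtained as

$$\begin{aligned} p(x(k) \mid M_j(k), \theta^1_{l_1}, \theta^2_{l_2}, Z^k) &= \frac{1}{c[M_j(k), \theta^1_{l_1}, \theta^2_{l_2}]}\, p(Z(k) \mid x(k), M_j(k), \theta^1_{l_1}, \theta^2_{l_2}, Z^{k-1})\, p(x(k) \mid M_j(k), \theta^1_{l_1}, \theta^2_{l_2}, Z^{k-1}) \\ &= \frac{1}{c[M_j(k), \theta^1_{l_1}, \theta^2_{l_2}]} \prod_{i=1}^{2} \left[ p(Z^i(k) \mid x(k), M_j(k), \theta^i_{l_i}, Z^{k-1}) \right] p(x(k) \mid M_j(k), Z^{k-1}) \\ &= \frac{1}{c[M_j(k), \theta^1_{l_1}, \theta^2_{l_2}]}\, \frac{\prod_{i=1}^{2} \left[ p(Z^i(k) \mid x(k), M_j(k), \theta^i_{l_i}, Z^{k-1})\, p(x(k) \mid M_j(k), Z^{k-1}) \right]}{p(x(k) \mid M_j(k), Z^{k-1})} \end{aligned} \qquad (24)$$

where

$$c[M_j(k), \theta^1_{l_1}, \theta^2_{l_2}] = \int p(Z(k) \mid x(k), M_j(k), \theta^1_{l_1}, \theta^2_{l_2}, Z^{k-1})\, p(x(k) \mid M_j(k), \theta^1_{l_1}, \theta^2_{l_2}, Z^{k-1})\, dx(k) = p(Z(k) \mid M_j(k), \theta^1_{l_1}, \theta^2_{l_2}, Z^{k-1}) \qquad (25)$$

is the normalization constant.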

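Since from equations (12) and (9)

$$p(x(k) \mid M_j(k), \theta^i_{l_i}, Z^{i,k}, Y^{i,k}) = \frac{1}{c^i_1[M_j(k), \theta^i_{l_i}]}\, p(Z^i(k) \mid x(k), M_j(k), \theta^i_{l_i}, Z^{k-1})\, p(x(k) \mid M_j(k), Z^{k-1}), \qquad (26)$$

equation (24) can be rewritten as

$$\begin{aligned} p(x(k) \mid M_j(k), \theta^1_{l_1}, \theta^2_{l_2}, Z^k) &= \frac{1}{c[M_j(k), \theta^1_{l_1}, \theta^2_{l_2}]}\, \frac{\prod_{i=1}^{2} \left[ c^i_1[M_j(k), \theta^i_{l_i}]\, p(x(k) \mid M_j(k), \theta^i_{l_i}, Z^{i,k}, Y^{i,k}) \right]}{p(x(k) \mid M_j(k), Z^{k-1})} \\ &= \frac{1}{c_0[M_j(k), \theta^1_{l_1}, \theta^2_{l_2}]}\, \frac{\prod_{i=1}^{2} p(x(k) \mid M_j(k), \theta^i_{l_i}, Z^{i,k}, Y^{i,k})}{p(x(k) \mid M_j(k), Z^{k-1})} \end{aligned} \qquad (27)$$

where the denominator can be derived as

$$p(x(k) \mid M_j(k), Z^{k-1}) = \frac{p(x(k), M_j(k) \mid Z^{k-1})}{P\{M_j(k) \mid Z^{k-1}\}} = \frac{\sum_{l=1}^{r} p(x(k) \mid M_j(k), M_l(k-1), Z^{k-1})\, P\{M_j(k) \mid M_l(k-1)\}\, P\{M_l(k-1) \mid Z^{k-1}\}}{\sum_{l=1}^{r} P\{M_j(k) \mid M_l(k-1)\}\, P\{M_l(k-1) \mid Z^{k-1}\}} \qquad (28)$$

and

$$c_0[M_j(k), \theta^1_{l_1}, \theta^2_{l_2}] \triangleq \frac{c[M_j(k), \theta^1_{l_1}, \theta^2_{l_2}]}{\prod_{i=1}^{2} c^i_1[M_j(k), \theta^i_{l_i}]} = \int \frac{\prod_{i=1}^{2} p(x(k) \mid M_j(k), \theta^i_{l_i}, Z^{i,k}, Y^{i,k})}{p(x(k) \mid M_j(k), Z^{k-1})}\, dx(k). \qquad (29)$$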


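Assuming $\theta^1_{l_1}$ and $\theta^2_{l_2}$ are independent given the target state, then, similarly to Chang et al. (1986), the second term of equation (23) can be obtained as

$$\begin{aligned} P\{\theta^1_{l_1}, \theta^2_{l_2} \mid M_j(k), Z^k\} &= \frac{1}{c_1[M_j(k)]} \int p(\theta^1_{l_1}, \theta^2_{l_2}, Z(k) \mid x(k), M_j(k), Z^{k-1})\, p(x(k) \mid M_j(k), Z^{k-1})\, dx(k) \\ &= \frac{1}{c_1[M_j(k)]} \int \frac{\prod_{i=1}^{2} p(x(k), \theta^i_{l_i}, Z^i(k) \mid M_j(k), Z^{k-1})}{p(x(k) \mid M_j(k), Z^{k-1})}\, dx(k) \\ &= \frac{\prod_{i=1}^{2} P\{\theta^i_{l_i} \mid M_j(k), Z^i(k), Z^{k-1}\}}{c_2[M_j(k)]} \int \frac{\prod_{i=1}^{2} p(x(k) \mid M_j(k), \theta^i_{l_i}, Z^i(k), Z^{k-1})}{p(x(k) \mid M_j(k), Z^{k-1})}\, dx(k) \end{aligned} \qquad (30)$$

where

$$c_1[M_j(k)] = p(Z(k) \mid M_j(k), Z^{k-1}) \qquad (31)$$

and

$$c_2[M_j(k)] \triangleq \frac{c_1[M_j(k)]}{\prod_{i=1}^{2} p(Z^i(k) \mid M_j(k), Z^{k-1})} = \frac{c_1[M_j(k)]}{\prod_{i=1}^{2} c^i_2[M_j(k)]} \qquad (32)$$

are normalization constants, where $c^i_2[M_j(k)]$ was given in equation (15).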

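Since the information contained in $Z^{k-1}$ is the same as that in $\{Z^{i,k-1}, Y^{i,k}\}$ (see equation (9) for details), equation (30) can be written as

$$\begin{aligned} P\{\theta^1_{l_1}, \theta^2_{l_2} \mid M_j(k), Z^k\} &= \frac{\prod_{i=1}^{2} P\{\theta^i_{l_i} \mid M_j(k), Z^{i,k}, Y^{i,k}\}}{c_2[M_j(k)]} \int \frac{\prod_{i=1}^{2} p(x(k) \mid M_j(k), \theta^i_{l_i}, Z^{i,k}, Y^{i,k})}{p(x(k) \mid M_j(k), Z^{k-1})}\, dx(k) \\ &= \frac{\prod_{i=1}^{2} P\{\theta^i_{l_i} \mid M_j(k), Z^{i,k}, Y^{i,k}\}}{c_2[M_j(k)]} \times c_0[M_j(k), \theta^1_{l_1}, \theta^2_{l_2}]. \end{aligned} \qquad (33)$$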


is  the  new  normalization  constant. 


From equations (27) and (33), equation (23) can be written as

$$p(x(k) \mid Z^k) = \sum_{j=1}^{r} \left\{ \sum_{l_1=0}^{m^1_k} \sum_{l_2=0}^{m^2_k} \frac{1}{c_2[M_j(k)]}\, \frac{\prod_{i=1}^{2} \left[ p(x(k) \mid M_j(k), \theta^i_{l_i}, Z^{i,k}, Y^{i,k})\, P\{\theta^i_{l_i} \mid M_j(k), Z^{i,k}, Y^{i,k}\} \right]}{p(x(k) \mid M_j(k), Z^{k-1})} \right\} P\{M_j(k) \mid Z^k\} \qquad (34)$$

where the denominator is the same as that of equation (28).

The last term of equation (34) is the global a posteriori model probability. With equations (31) and (32) we have

$$\begin{aligned} P\{M_j(k) \mid Z^k\} &= \frac{1}{c}\, p(Z(k) \mid M_j(k), Z^{k-1})\, P\{M_j(k) \mid Z^{k-1}\} = \frac{1}{c}\, c_1[M_j(k)]\, P\{M_j(k) \mid Z^{k-1}\} \\ &= \frac{c_2[M_j(k)]}{c}\, \frac{\prod_{i=1}^{2} \left[ c^i_2[M_j(k)]\, P\{M_j(k) \mid Z^{k-1}\} \right]}{P\{M_j(k) \mid Z^{k-1}\}} \\ &= \frac{c_2[M_j(k)]}{c}\, \frac{\prod_{i=1}^{2} \left[ p(Z^i(k) \mid M_j(k), Z^{i,k-1}, Y^{i,k})\, P\{M_j(k) \mid Z^{k-1}\} \right]}{P\{M_j(k) \mid Z^{k-1}\}} \\ &= \frac{c_2[M_j(k)]}{c}\, \frac{\prod_{i=1}^{2} \left[ P\{M_j(k) \mid Z^i(k), Z^{k-1}\}\, p(Z^i(k) \mid Z^{k-1}) \right]}{P\{M_j(k) \mid Z^{k-1}\}} \\ &= \frac{c_2[M_j(k)]}{c}\, \frac{\prod_{i=1}^{2} \left[ P\{M_j(k) \mid Z^{i,k}, Y^{i,k}\}\, p(Z^i(k) \mid Z^{k-1}) \right]}{P\{M_j(k) \mid Z^{k-1}\}} \\ &= \frac{c_2[M_j(k)]}{c'}\, \frac{\prod_{i=1}^{2} P\{M_j(k) \mid Z^{i,k}, Y^{i,k}\}}{P\{M_j(k) \mid Z^{k-1}\}} \end{aligned} \qquad (35)$$

where the normalization constants $c$ and $c'$ are

$$c = p(Z(k) \mid Z^{k-1}) \qquad (36)$$

$$c' = \frac{c}{\prod_{i=1}^{2} p(Z^i(k) \mid Z^{k-1})} = \frac{c}{\prod_{i=1}^{2} c^i_4}. \qquad (37)$$

4.1. Overview of the fusion algorithm

From the above, it follows that the global a posteriori pdf and model probabilities are obtained by combining (multiplying) the local a posteriori pdfs and model probabilities and removing (dividing) the common a priori pdf and model probabilities. From equation (34), we can see that for each model, the conditional global pdf given that this model is correct is obtained by the sum of global fused pdfs given all possible global event pairs $\theta^1_{l_1}, \theta^2_{l_2}$. The overall global a posteriori pdf is then obtained by the sum of global pdfs of each model weighted by the global a posteriori model probabilities. Equations (34) and (35) represent the complete cycle of fusion processing. From them it follows that the information needed to be communicated from local nodes to the fusion node consists of:

(a) the model probabilities;

(b) the association event probabilities; and

(c) the corresponding pdfs (mean and covariance for Gaussian case).

A summary flow diagram of the fusion algorithm with two models is given in Fig. 2. For simplicity, only the mean of each pdf is shown in the figure. References to the corresponding equations are also given in the figure.

Fig. 2. Distributed MMPDA algorithm with r = 2.
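The "multiply the local posteriors, divide by the common prior" structure described above can be sketched for two nodes as follows. The names are hypothetical. The model-dependent factor $c_2[M_j(k)]$ of equation (32) is passed in by the caller; defaulting it to one is a simplification for illustration only. The second function is the scalar Gaussian special case of the same structure (an assumption of this sketch, written in information form), not the full event-pair mixture of equation (34).

```python
def fuse_model_probs(local_a, local_b, prior, c2=None):
    """Global model probabilities with the structure of equation (35), two nodes.

    local_a[j], local_b[j]: local a posteriori model probabilities at each node.
    prior[j]: common a priori (predicted) model probability.
    c2[j]: model-dependent factor from (32); assumed 1.0 when not supplied.
    """
    r = len(prior)
    if c2 is None:
        c2 = [1.0] * r            # simplifying assumption, see lead-in
    weights = [c2[j] * local_a[j] * local_b[j] / prior[j] for j in range(r)]
    total = sum(weights)          # absorbs the constant c'
    return [w / total for w in weights]


def fuse_gaussian_scalar(mean_a, var_a, mean_b, var_b, mean_prior, var_prior):
    """Fuse two scalar Gaussian posteriors that share a common Gaussian prior.

    Product of the two local pdfs divided by the prior pdf, in information form.
    """
    info = 1.0 / var_a + 1.0 / var_b - 1.0 / var_prior
    if info <= 0.0:
        raise ValueError("fused information must be positive")
    var = 1.0 / info
    mean = var * (mean_a / var_a + mean_b / var_b - mean_prior / var_prior)
    return mean, var
```

Subtracting the prior information once is what prevents the common a priori knowledge from being counted twice when the two local results are combined.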

5.  SIMULATION  RESULTS 
A  two-dimensional  single  target  tracking 
problem  will  be  considered.  Two  target  dynamic 
models  will  be  assumed,  one  with  (nearly) 
constant  velocity  and  the  other  with  (nearly) 
constant  acceleration.  The  Markov  transition 
matrix  of  the  models  is  known  and  given.  The 
initial  target  state  estimate  is  given  and  the  initial 
probabilities  of  the  two  target  models  are 
assumed  equal. 

The target dynamic models with discretization over time intervals of length $T$ are

$$x(k) = F[M(k)]\, x(k-1) + G[M(k)]\, v(k-1) \qquad (38)$$

where for model 1, the nearly constant velocity model, the state is

$$x = [x \;\; \dot{x} \;\; y \;\; \dot{y}]'. \qquad (39)$$


The  process  noise  v(k)  =  [vx,  vy]‘  representing 
the  acceleration  during  one  period  is  a  zero 
mean  Gaussian  white  noise  vector  with 


covariance 


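For model 2 (with acceleration), the state is

$$x = [x \;\; \dot{x} \;\; \ddot{x} \;\; y \;\; \dot{y} \;\; \ddot{y}]' \qquad (42)$$

and

$$F = \begin{bmatrix} 1 & T & T^2/2 & 0 & 0 & 0 \\ 0 & 1 & T & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & T & T^2/2 \\ 0 & 0 & 0 & 0 & 1 & T \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}, \qquad G = \begin{bmatrix} T^2/2 & 0 \\ T & 0 \\ 1 & 0 \\ 0 & T^2/2 \\ 0 & T \\ 0 & 1 \end{bmatrix} \qquad (43)$$

where the process noise $v(k)$, representing here the acceleration increment over one period, is a zero mean Gaussian white noise vector with covariance

$$\begin{bmatrix} q_{2,x} & 0 \\ 0 & q_{2,y} \end{bmatrix}. \qquad (44)$$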
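The block structure of the model 2 matrices in (43) can be generated programmatically. The sketch below is illustrative: the function name is hypothetical and the matrices are returned as plain nested lists, with the state ordered as in (42).

```python
def ca_matrices(T):
    """Transition and noise-gain matrices for the nearly constant acceleration model.

    State ordering follows (42): [x, x', x'', y, y', y'']. Each coordinate uses
    the same 3x3 block; the noise is an acceleration increment per coordinate.
    """
    block_f = [[1.0, T, T * T / 2.0],
               [0.0, 1.0, T],
               [0.0, 0.0, 1.0]]
    block_g = [T * T / 2.0, T, 1.0]
    F = [[0.0] * 6 for _ in range(6)]
    G = [[0.0] * 2 for _ in range(6)]
    for axis in range(2):                 # 0 = x coordinate, 1 = y coordinate
        offset = 3 * axis
        for i in range(3):
            for j in range(3):
                F[offset + i][offset + j] = block_f[i][j]
            G[offset + i][axis] = block_g[i]
    return F, G
```

Building both coordinates from one block keeps the x and y dynamics identical by construction, which is how the two diagonal blocks of (43) are related.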


Assuming only position measurements to be available, then, for node $i$

$$z^i(k) = H^i x(k) + w^i(k) \qquad (45)$$

where

$$H^i = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \end{bmatrix} \qquad (46)$$

and $w^i(k)$ is a zero mean Gaussian white noise vector with covariance


0] 

.0  rlX 


To  overcome  the  fact  that  one  has  different 
state  dimensions  the  lower  dimension  vector  was 
augmented  with  suitable  zero  components 
(which  then  have  mean  and  variance  zero)  to 
make  it  compatible  with  the  higher  dimension 
state. 

With sampling interval $T = 1$ s, the true target is simulated with constant velocity for the first seven scans, then switches to constant acceleration for the next seven scans, and finally returns to constant velocity for another seven scans. The initial target state is assumed to be [100 m, 30 m s$^{-1}$, 0, 100 m, 15 m s$^{-1}$, 0] and the acceleration is assumed to be 5 and -5 m s$^{-2}$ for the x and y coordinates, respectively.
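The maneuver profile just described can be reproduced with a few lines of code. This sketch assumes a noise-free truth trajectory and treats the stated initial state as the state before the first scan; neither detail is specified in the text, and the function name is hypothetical.

```python
def simulate_truth(T=1.0):
    """True target positions for 21 scans: 7 constant velocity, 7 accelerating, 7 constant velocity.

    Assumptions of this sketch: no process noise in the truth, and the initial
    state [100, 30, 0, 100, 15, 0] applies one sampling interval before scan 1.
    """
    x, vx, y, vy = 100.0, 30.0, 100.0, 15.0
    segments = [(7, 0.0, 0.0), (7, 5.0, -5.0), (7, 0.0, 0.0)]  # (scans, ax, ay)
    positions = []
    for scans, ax, ay in segments:
        for _ in range(scans):
            # Exact discrete update for piecewise-constant acceleration.
            x += vx * T + 0.5 * ax * T * T
            y += vy * T + 0.5 * ay * T * T
            vx += ax * T
            vy += ay * T
            positions.append((x, y))
    return positions
```

Under these assumptions the target turns back in y during the acceleration phase, because the y velocity changes sign from 15 m/s to -20 m/s.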

The variances of the process noise are taken as $q_{1,x} = q_{1,y} = 0.1$ (m s$^{-2}$)$^2$ for model 1, the nearly constant velocity model, and $q_{2,x} = q_{2,y} = 1.0$ (m s$^{-2}$)$^2$ for model 2, the nearly constant acceleration model. The detection probabilities for both sensors are equal to 0.67 and the false alarm rates are 0.0001 m$^{-2}$. The standard deviations of the measurement errors are assumed to be $\sqrt{10}$ m for both x and y coordinates of the two sensors. The Markov transition matrix for the model parameters is assumed to be

$$\begin{bmatrix} 0.9 & 0.1 \\ 0.1 & 0.9 \end{bmatrix}.$$

The initial state estimate is generated randomly with mean the same as the true target state and covariance matrix equal to

diag [100, 1, 0.1, 100, 1, 0.1].

Three different configurations will be tested. First, each sensor will be simulated independently using the MMPDA algorithm described in Section 3. Second, a centralized processing with measurements from both sensors will be simulated using the same MMPDA algorithm. Finally, the distributed case will be simulated. In this case, the two nodes will communicate every scan.* At each scan, each node will process its own sensor measurements first, then send the local processed results to the fusion node. After receiving the information from both local nodes, the fusion node will use the fusion algorithm derived in the previous section to construct the global estimates and send the results back to each local node.

Fig. 3. Tracking results with sensor 1 only (one sample run).

Simulations were carried out with 50 Monte Carlo runs. The results of one sample run are shown in Figs 3-5. Figures 3 and 4 show the estimated and true trajectories of the target with sensors 1 and 2, respectively. Figure 5 shows the results for the distributed case where the two sensors interchanged their processed results. As one can see, the single sensor processed results have poor performance, and the target is lost in both cases. Figure 6 shows the probability trajectories of model 2 for the three cases as calculated by the corresponding state/model estimators. As can be seen from the figures, in both single sensor cases the algorithm fails to detect clearly the switches of the target between two models. The distributed algorithm not only responds faster in detecting the first jump of the target from the constant velocity mode to the constant acceleration mode, but also successfully detects the end of the acceleration. The centralized algorithm, which is not shown in the figures, performs exactly the same as the distributed one.

*This is totally equivalent to the centralized configuration but has the advantages of redundancy and reliability for a DSN system. This configuration can also be used with a lower communication rate (Chang et al., 1986).

The average performances of the three configurations over 50 runs are given in Table 1. The centralized and distributed algorithms successfully track the target in 43 out of 50 runs ("successful tracking" is defined as the estimated target position being within 30 m of the true target position for the last three scans). However, out of 50 runs, sensor 1 alone and sensor 2 alone only track the target successfully in 27 and 30 runs, respectively. The r.m.s. position errors for those successful runs are also calculated. Here too, the centralized and distributed algorithms perform better than the single sensor configurations. Note that the quality of the estimation using two sensors in terms of mean square error is significantly better than using a single sensor.
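The success criterion quoted above (estimate within 30 m of the truth for the last three scans) is easy to state in code. The sketch below uses hypothetical function names, represents positions as (x, y) tuples, and treats "within 30 m" as an inclusive bound, which is an assumption. How the r.m.s. error in Table 1 was averaged is not stated, so it is not reproduced here.

```python
import math


def position_error(estimate, truth):
    """Euclidean distance between an estimated and a true (x, y) position."""
    return math.hypot(estimate[0] - truth[0], estimate[1] - truth[1])


def is_successful_track(estimates, truths, threshold=30.0, last_n=3):
    """True if the estimate stays within `threshold` of the truth over the last `last_n` scans."""
    pairs = list(zip(estimates, truths))[-last_n:]
    if len(pairs) < last_n:
        return False
    return all(position_error(e, t) <= threshold for e, t in pairs)


def count_successful_runs(runs):
    """Count successful runs; each run is an (estimates, truths) pair of position lists."""
    return sum(1 for estimates, truths in runs if is_successful_track(estimates, truths))
```

Only the final scans are checked, so a run that drifts early but recovers still counts as a success under this criterion.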

The centralized case yields the upper bound of the performance for the distributed configuration when the nodes communicate every scan. The simulation shows that the results of the distributed algorithm are the same as in the centralized algorithm, which confirms the theoretical equivalence.

Fig. 4. Tracking results with sensor 2 only (one sample run).

Fig. 5. Tracking results of distributed case (one sample run).

Fig. 6. Model 2 probability trajectories.

6.  CONCLUSION 

A  recursive  estimation  algorithm  that  accounts 
for  the  uncertainties  of  both  measurement 
origins  and  system  models  in  a  distributed 
framework  has  been  derived.  The  distributed 
estimation  technique  has  been  adopted  together 
with  the  probabilistic  data  association  (PDA) 
filter in conjunction with the interacting multiple model (IMM) scheme. The resulting algorithm can be applied to track a maneuvering target in a cluttered environment with distributed sensors. Simulation results show the expected performance of the algorithm. With full communication rate, the distributed case performs exactly the same as the centralized case, which confirms the theoretical equivalence, but has the advantage of increased reliability.
                               Decentralized case       Centralized    Distributed case
                               Node 1      Node 2       case           (full rate communication)

Number of successful tracks    27          30           43             43

r.m.s. position error          10.943      9.368        3.055          3.055
Substituting these into (3.10), and using (E-2), we obtain

$$\lim_{t \to \infty} e_1(t) = 0. \qquad (3.13)$$

The estimation property (E-3), the uniform boundedness of $y(t)$ and $u(t)$, and (2.5), the definition of $t_j$, imply that

$$\lim_{t \to \infty} e(t) = 0.$$

Substituting this into (3.11) and, again, using (E-2) we obtain

$$\lim_{t \to \infty} e_2(t) = 0. \qquad (3.14)$$

Since $E(z^{-1})$ is a stable polynomial, we can establish ii) by substituting (3.13) and (3.14) into (3.12). ∇∇∇

Remark 3.1: The multirate sampling estimation algorithm in general does not have the property that $e(t)/(1 + \|\phi(t-1)\|^2)^{1/2} \in \ell_2$, which is required in the stability proof of conventional adaptive control algorithms. However, we still prove the stability using property (E-3) and the relation $|e(t_j)| \le |e(t)|$ for $t_{j-1} \le t < t_j$.

IV. Conclusions

In  this  note,  we  have  developed  a  multirate  sampling  adaptive  control 
algorithm  which  allows  a  fast  sampling  rate  of  feedback  control  to  be  used 
even  if  the  computation  of  parameter  estimate  and  controller  coefficient 
may  take  a  relatively  long  period  of  time. 

The key idea to achieve this is to record the plant input and output prior to the currently obtained estimate and use them to compute the coming estimate and controller coefficients. Thus, the computation is not dependent upon the inputs and outputs appearing during the updating process. The closed-loop system is shown to be stable.

Remark  4.1: 

i)  One  may  further  extend  the  algorithm  to  consider  tj  -  f/_  (  >  n  +  m 
+  d  =  /f.  In  this  case,  a  relation 

|e((;_i  +  d+Ar)|sC|  max  |e(f)|  +  Cj 

(k  <  oo,  C|  <  oo,  Q  <  °°),  can  be  used,  and  the  algorithm  only  needs  to 
compute  e(t)  (octj.,  i  t  <  f/_(  +  ri  but  not  for  every  t  in  f;_i  s  t  <  tj. 

ii) Instead of the ARMA model, one can use the δ-model [8] in the algorithm, which retains the key features of the continuous-time model and allows a wide bandwidth MRAC system to be achieved.

iii) The multirate sampling adaptive control is presented for an indirect MRAC system. However, the method covers a wide class of direct and indirect adaptive control algorithms of certainty equivalence type such as pole-assignment, LQ-optimal, etc.

iv)  Various  methods  developed  for  improving  adaptive  control  system 
performance  are  applicable  to  the  presented  multirate  sampling  adaptive 
algorithm.  These  methods  include:  a)  various  modifications  of  parameter 
estimator  for  improving  convergence  rate;  b)  noise  and  disturbance 
filtering  techniques;  c)  robustness  techniques  with  respect  to  disturbances 
and  unmodeled  dynamics,  such  as  deadzone,  normalization,  etc.;  d) 
internal  model  principle  for  deterministic  disturbance  rejection,  etc. 
An Adaptive Dual Controller for a MIMO-ARMA System

P. MOOKERJEE AND Y. BAR-SHALOM

Abstract: An adaptive dual controller is presented here for a multiinput multioutput ARMA system. The plant has constant but unknown parameters. The cautious controller with a one-step horizon and a new dual controller with a two-step horizon are examined. In many instances, the myopic cautious controller is seen to turn off and converges very slowly. The dual controller modifies the cautious control design by numerator and denominator correction terms which depend upon the sensitivity functions of the expected future cost and avoids the turn-off and slow convergence. Monte-Carlo comparisons based on parametric and nonparametric statistical analysis indicate the superiority of the dual controller over the cautious controller.

I.  INTRODUCTION 

Multiinput  multioutput  systems  with  unknown  parameters  are  encoun¬ 
tered  in  many  practical  situations,  and  their  control  poses  a  great 
challenge  to  the  stochastic  control  theory.  It  is  not  possible  to  obtain  an 
optimal  solution  for  such  systems  because  of  the  dimensionality  involved 
in  the  stochastic  dynamic  programming  [6].  In  such  situations,  emphasis 
is  on  obtaining  a  suboptimal  solution  that  incorporates  the  intrinsic 
properties of the optimal solution. For stochastic systems, the control has in general a dual effect [2], [11]: it affects the system's state as well as the future state and/or parameter uncertainty. Thus, the dual controller offers significant improvement potential for the control of uncertain linear plants. In multistage problems it "probes" the system to enhance real-time identification of the system's parameters in order to increase the accuracy of the subsequent control decisions and regulates the system at the same time [4], [9].

Two  classes  of  dual  controllers  exist  presently  [14].  In  the  first  class 
[10],  [12],  [18],  the  control  minimizes  a  one-step  ahead  criterion 
augmented  by  a  second  term  which  penalizes  for  poor  identification.  This 
approach  is  simple  but  often  requires  tuning  of  some  parameters.  The 
second  class  (developed  for  SISO  systems  in  [3],  [16],  [17])  used  the 
stochastic  dynamic  programming  equation  and  expands  the  future  cost 
about  a  nominal  trajectory.  Using  first-  and  second-order  Taylor  series 
expansions  of  the  expected  future  cost  about  a  nominal  trajectory,  dual 
controllers for MIMO static systems are developed in [5] and [14]. A second-order Taylor series expansion of the future expected cost is performed about a nominal trajectory and a dual controller based on a two-step horizon is developed in this note for a MIMO dynamic (ARMA) model. The cautious [14], [16], [18] and the new dual controller are applied to a MIMO-ARMA system. Monte Carlo simulations, based on parametric and nonparametric statistical analysis, indicate that the dual controller prevents the turn-off phenomenon and slow convergence prevalent with a cautious solution.

Section II gives the problem formulation. The approximate dual controller with a two-step horizon for the MIMO system is derived in Section III. The control solution is obtained by approximating the solution of the stochastic dynamic programming equation. A second-order Taylor series expansion of the expected future cost is performed about a nominal trajectory and this leads to a dual control solution in a closed form. Following the derivations of the controller, a summary of the algorithm is given. Section IV describes the simulation of the plant and compares the performances of the cautious and the dual solutions. Section V concludes the note.

II. Problem formulation

The MIMO system to be controlled is described by

$$y(k) = -A y(k-1) + B u(k-1) + e(k) \qquad (1)$$

where

$$E[e(k)] = 0; \quad E[e(k) e'(j)] = W \delta_{kj}. \qquad (2)$$

Here $y(k)$ is the output of the plant, $u(k)$ is the input to the plant, and $e(k)$ is the measurement noise.

The parameter matrices $A$ and $B$ are unknown. This model describes some industrial processes like an ore crushing plant, or a heat exchanger [1]. The unknown elements of $A$ and $B$ comprise the parameter vector $\theta(k)$ whose estimate at time $k$ is $\hat{\theta}(k)$ with covariance matrix $P(k)$. The parameter vector is designated as

$$\theta(k) \triangleq [a_1' \mid b_1' \mid a_2' \mid b_2' \mid \cdots \mid a_n' \mid b_n']' \qquad (3)$$

where $n$ is the dimension of the output vector $y(k)$ and $a_i'$, $b_i'$ are the $i$th rows of the matrices $A$ and $B$, respectively. Assuming the parameters are time-invariant, we have

$$\theta(k+1) = \theta(k). \qquad (4)$$

A measurement matrix $H(k)$ is defined as

$$H(k) \triangleq \mathrm{diag}\,[-y'(k) \mid u'(k),\; -y'(k) \mid u'(k),\; \ldots] \qquad (5)$$

where $H(k)$ has $n$ rows, and $y'(k)$, $u'(k)$ are the measurement and control vectors transposed.

With these definitions, the measurement model is

$$y(k) = H(k-1)\,\theta(k-1) + e(k). \qquad (6)$$
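To make the regression form concrete, the sketch below packs the rows of A and B into a parameter vector as in (3), builds the block-diagonal measurement matrix of (5), and forms the noise-free prediction of (6). The function names and the nested-list representation are assumptions of this illustration.

```python
def pack_parameters(A, B):
    """Stack the rows of A and B as [a_1', b_1', a_2', b_2', ...], as in equation (3)."""
    theta = []
    for a_row, b_row in zip(A, B):
        theta.extend(a_row)
        theta.extend(b_row)
    return theta


def measurement_matrix(y, u):
    """Block-diagonal H(k) of equation (5): every row carries the block [-y'(k), u'(k)]."""
    n = len(y)
    block = [-v for v in y] + list(u)
    width = len(block)
    H = [[0.0] * (n * width) for _ in range(n)]
    for i in range(n):
        H[i][i * width:(i + 1) * width] = block
    return H


def predict_output(H, theta):
    """Noise-free part of equation (6): H(k-1) * theta."""
    return [sum(h * t for h, t in zip(row, theta)) for row in H]
```

With this layout, row $i$ of the product reproduces $-a_i' y + b_i' u$, so the prediction matches $-A y + B u$ from equation (1).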

The performance criterion to be minimized is $J(0)$, i.e., the conditional expected value of the cost $C(0)$ from step 0 to $N$, denoted by

$$J(0) = E\{C(0) \mid I^0\} = E\left\{ \sum_{k=0}^{N-1} \{y(k+1) - y_r\}'\, Q(k)\, \{y(k+1) - y_r\} \,\Big|\, I^0 \right\} \qquad (7)$$

where $Q(k)$ is the diagonal weighting matrix, $I^k$ is the cumulated information at time $k$, and $y_r$ is the desired output.

III. DUAL CONTROL WITH A TWO-STEP HORIZON

First  the  controller  is  derived  and  then  a  summary  of  the  algorithm  is 
provided. 

A dual control solution with a two-step horizon is obtained by minimizing (2.7) with respect to the control $u(0)$ for the multidimensional plant (2.1)-(2.4). This is obtained by solving the general equation of stochastic dynamic programming [3], [7], [8]

$$J^*(k) = \min_{u(k)} E\{C(k) + J^*(k+1) \mid I^k\}, \quad k = N-1, \ldots, 1, 0 \qquad (1)$$

where $J^*(k)$ is the optimal expected cost to go from $k$ to $N$, $C(k)$ is the cost to go from $k$ to $N$, and $I^k$ is the cumulated information at time $k$ when the control $u(k)$ is to be applied. The information $I^k$ is the set of all past controls until time $k-1$ and outputs until time $k$.

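Thus, for a two-step horizon we have

$$J^*(k) = \min_{u(k)} E\left[ \{y(k+1) - y_r\}'\, Q(k)\, \{y(k+1) - y_r\} + J^*_{k+1,k+2} \,\Big|\, I^k \right] \qquad (2)$$

where $J^*_{k+1,k+2}$ is the optimal expected cost at the last step with one-step horizon and is obtained by minimization of $J_{k+1,k+2}$, and $J_{k+1,k+2}$ is the cost to go from $k+1$ to $k+2$.

The cautious control at $k+1$ with one-step horizon is given by

$$u(k+1) = \left[ E\{B' Q(k+1) B \mid I^{k+1}\} \right]^{-1} E\left[ B' Q(k+1) \{A y(k+1) + y_r\} \mid I^{k+1} \right]. \qquad (3)$$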
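A scalar special case helps show why the cautious controller can "turn off". For a hypothetical single-input single-output plant $y(k+1) = -a\,y(k) + b\,u(k) + e(k+1)$ with Gaussian parameter estimates, the expectations in the cautious law expand as $E[b^2] = \hat{b}^2 + P_{bb}$ and $E[ab] = \hat{a}\hat{b} + P_{ab}$, and the scalar weight cancels. The function below is an illustrative sketch under these assumptions; its name and argument list are not from the source.

```python
def cautious_control_scalar(a_hat, b_hat, p_ab, p_bb, y, y_ref):
    """One-step cautious control for the scalar plant y(k+1) = -a*y(k) + b*u(k) + e.

    a_hat, b_hat: parameter estimates; p_ab, p_bb: entries of their covariance.
    Scalar analogue of u = [E{B'QB}]^-1 E[B'Q(Ay + y_r)], with Q cancelling.
    """
    numerator = (a_hat * b_hat + p_ab) * y + b_hat * y_ref
    denominator = b_hat * b_hat + p_bb
    if denominator == 0.0:
        raise ValueError("control gain estimate and its variance are both zero")
    return numerator / denominator
```

With no parameter uncertainty the control drives the predicted output exactly to the reference; as the variance of the gain estimate grows, the denominator grows and the control shrinks toward zero, which is the turn-off behavior discussed in the abstract.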

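The cost from step $k+1$ to $k+2$ is

$$\begin{aligned} J_{k+1,k+2} = {} & \mathrm{tr}[Q(k+1) W] + E\big[ \{A y(k+1) + y_r\}'\, Q(k+1)\, \{A y(k+1) + y_r\} \\ & + u'(k+1) B' Q(k+1) B u(k+1) - 2 \{A y(k+1) + y_r\}'\, Q(k+1) B u(k+1) \mid I^{k+1} \big] \end{aligned} \qquad (4)$$

and inserting (3) into (4) the optimal cost at the last step is

$$\begin{aligned} J^*_{k+1,k+2} = {} & \mathrm{tr}[Q(k+1) W] + E\left[ \{A y(k+1) + y_r\}'\, Q(k+1)\, \{A y(k+1) + y_r\} \mid I^{k+1} \right] \\ & - E\left[ \{A y(k+1) + y_r\}'\, Q(k+1) B \mid I^{k+1} \right] \cdot \left[ E\{B' Q(k+1) B \mid I^{k+1}\} \right]^{-1} \cdot E\left[ B' Q(k+1) \{A y(k+1) + y_r\} \mid I^{k+1} \right] \end{aligned} \qquad (5)$$

where $E\{\cdot \mid I^{k+1}\}$ is the conditional expectation given the available information $I^{k+1}$.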

The unknown parameters will be chosen from the Gaussian family and thus their estimate $\hat{\theta}(k+1)$ and associated error covariance $P(k+1)$ are the sufficient statistic. The parameter vector estimate $\hat{\theta}(k+1)$ and the associated covariance matrix $P(k+1)$ are obtained from a Kalman filter according to

$$K(k+1) = P(k) H'(k) \left[ H(k) P(k) H'(k) + W \right]^{-1} \qquad (6)$$

$$\hat{\theta}(k+1) = \hat{\theta}(k) + K(k+1) \left[ y(k+1) - H(k) \hat{\theta}(k) \right] = \hat{\theta}(k) + K(k+1)\, \nu(k+1) \qquad (7)$$

$$P(k+1) = P(k) - P(k) H'(k) \left[ H(k) P(k) H'(k) + W \right]^{-1} H(k) P(k). \qquad (8)$$

Here $\nu(k+1)$ is the innovation of the process.
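For a single-output plant the matrix inverse in the Kalman recursion becomes a scalar division, which allows a short dependency-free sketch. The function name is hypothetical, the single-output restriction is a simplification made here, and the covariance is assumed symmetric.

```python
def kalman_parameter_update(theta, P, h, y_next, w):
    """One parameter-estimation step for a single-output plant (scalar innovation).

    theta: current estimate (list of length r); P: r x r covariance (symmetric);
    h: the single row of H(k); y_next: measured y(k+1); w: noise variance W.
    Returns (theta_new, P_new).
    """
    r = len(theta)
    Ph = [sum(P[i][j] * h[j] for j in range(r)) for i in range(r)]   # P(k) H'(k)
    s = sum(h[i] * Ph[i] for i in range(r)) + w                      # H P H' + W
    gain = [v / s for v in Ph]                                       # K(k+1)
    innovation = y_next - sum(h[i] * theta[i] for i in range(r))     # nu(k+1)
    theta_new = [theta[i] + gain[i] * innovation for i in range(r)]
    # P - P H' S^-1 H P, using symmetry of P so that H P equals (P H')'.
    P_new = [[P[i][j] - gain[i] * Ph[j] for j in range(r)] for i in range(r)]
    return theta_new, P_new
```

Only the parameters that the regressor row excites are updated; directions with a zero regressor entry keep their prior variance, which is why the choice of control affects future parameter uncertainty.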

From (5) it is clear that $J^*_{k+1,k+2}$ is a nonlinear function of the estimated parameter vector $\hat{\theta}(k+1)$ and covariance $P(k+1)$. But the estimated vector $\hat{\theta}(k+1)$ and the covariance $P(k+1)$ are not known until the control $u(k)$ is applied.

A control $u(k)$ with a two-step horizon can be obtained from (2) if a second-order Taylor series expansion of $J^*_{k+1,k+2}$ is performed about a suitable nominal trajectory. Here the nominal trajectory is defined by

1) a nominal parameter estimate $\bar{\theta}(k+1) = \hat{\theta}(k)$

2) a nominal control $\bar{u}(k)$

3) a nominal covariance $\bar{P}(k+1)$ obtained by using $\bar{u}(k)$

4) a nominal measurement $\bar{y}(k+1)$ obtained by using $\bar{u}(k)$ and $\hat{\theta}(k)$, i.e., $\bar{y}(k+1) = \bar{H}(k) \hat{\theta}(k)$.
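Expansion of (5) about this nominal trajectory results in

$$J^*_{k+1,k+2} \approx J_0 + J_y'(k+1)[y(k+1)-\bar y(k+1)] + \tfrac{1}{2}[y(k+1)-\bar y(k+1)]'J_{yy}(k+1)[y(k+1)-\bar y(k+1)] + J_\theta'(k+1)[\hat\theta(k+1)-\hat\theta(k)] + \tfrac{1}{2}[\hat\theta(k+1)-\hat\theta(k)]'J_{\theta\theta}(k+1)[\hat\theta(k+1)-\hat\theta(k)] + \mathrm{tr}\{J_P(k+1)[P(k+1)-\bar P(k+1)]\} \tag{9}$$

where $J_0$ is the zeroth-order term and the cost sensitivities are

$$J_y(k+1) \triangleq \left[\frac{\partial J^*_{k+1,k+2}}{\partial y(k+1)}\right] \tag{10}$$

$$J_{yy}(k+1) \triangleq \left[\frac{\partial^2 J^*_{k+1,k+2}}{\partial y_i(k+1)\,\partial y_j(k+1)}\right] \tag{11}$$

$$J_\theta(k+1) \triangleq \left[\frac{\partial J^*_{k+1,k+2}}{\partial \hat\theta(k+1)}\right] \tag{12}$$

$$J_{\theta\theta}(k+1) \triangleq \left[\frac{\partial^2 J^*_{k+1,k+2}}{\partial \hat\theta_i(k+1)\,\partial \hat\theta_j(k+1)}\right] \tag{13}$$

$$J_P(k+1) \triangleq \left[\frac{\partial J^*_{k+1,k+2}}{\partial P^{ij}(k+1)}\right]. \tag{14}$$

In the covariance expansion (19), the superscript denotes the matrix element, $e_i$ is the $i$th Cartesian basis vector, and

$$P_u^{ij}(k+1) \triangleq \frac{\partial P^{ij}(k+1)}{\partial u(k)}, \qquad P_{uu}^{ij}(k+1) \triangleq \frac{\partial^2 P^{ij}(k+1)}{\partial u^2(k)}, \qquad i, j = 1, \ldots, r \tag{20}$$

evaluated at $\bar P(k+1)$ and $\bar u(k)$, with $r$ the number of unknown parameters.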

Now a (suboptimal) dual solution $u^D(k)$ with a two-step horizon can be obtained from (2) using (18)-(20) and is given in closed form by

$$u^D(k) = \left[E\{B'Q(k)B \mid I^k\} + F\right]^{-1}\left[E\{B'Q(k)(Ay(k)+y_r) \mid I^k\} + f\right] \tag{21}$$

where the elements of the matrix $F$ and those of the vector $f$ are given by


F'-laV'  [■ 


I 


Jp(k+ 1  )~2^"(k+ 


1 

+  2,f 


*2* 


nl 

’ j  9ut(k)duj(k)  J 

wwVl 


j„(k+ 1) 


\du,(k ) 


9uj(k) 


Hk) 


.1 


(22) 


and 

/;= 


i  (  mk) . 


The above sensitivities are evaluated at $\hat\theta(k)$, $\bar P(k+1)$, and $\bar y(k+1)$; and $P^{ij}(k+1)$ is the $ij$th element of the covariance matrix associated with the parameter estimates $\hat\theta_i(k+1)$ and $\hat\theta_j(k+1)$.

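Under the Gaussian assumption for the zero mean noise

$$y(k+1) - \bar y(k+1) \sim \mathcal{N}(\bar\nu, Y) \tag{15}$$

where the conditional mean is

$$\bar\nu = E\{H(k)\theta(k) + e(k+1) - \bar H(k)\hat\theta(k) \mid I^k\} = [H(k) - \bar H(k)]\hat\theta(k) \tag{16}$$

and the conditional covariance is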


2  \9u,(k) 


Hk)  }  J,{k+ 1) 


52“  [(«»t»-5«w+i)]  "'«> 


I  m 

■ 

L  i- 1  L 


tyk) 


(23) 


$$Y = E\{[y(k+1) - \bar y(k+1) - \bar\nu][y(k+1) - \bar y(k+1) - \bar\nu]' \mid I^k\} = H(k)P(k)H'(k) + W. \tag{17}$$

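With the choice of the nominal path as defined earlier and using (6), (16), and (17), the conditional expected value of (9) is

$$E\{J^*_{k+1,k+2} \mid I^k\} \approx J_0 + J_y'(k+1)[H(k)-\bar H(k)]\hat\theta(k) + \tfrac{1}{2}\bar\nu' J_{yy}(k+1)\bar\nu + \tfrac{1}{2}\mathrm{tr}\{J_{yy}(k+1)Y\} + \tfrac{1}{2}\mathrm{tr}\{J_{\theta\theta}(k+1)[P(k)-P(k+1)]\} + \mathrm{tr}[J_P(k+1)\{P(k+1)-\bar P(k+1)\}]. \tag{18}$$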

The above expected future cost (18) is a function of the nominal parameters multiplied by appropriate sensitivity functions $J_y(k+1)$, $J_{yy}(k+1)$, $J_{\theta\theta}(k+1)$, and $J_P(k+1)$. These sensitivities introduce the dual effect into (2), which is then used to yield $u(k)$. It must also be noted that the covariance $P(k+1)$ is nonlinear in $u(k)$ and is not yet known. Hence, a second-order expansion of $P(k+1)$ is proposed about a nominal control $\bar u(k)$ and a nominal covariance $\bar P(k+1)$ in order to obtain a (suboptimal) dual solution $u^D(k)$ in a closed form from (2).

This expansion is performed as follows:

$$P(k+1) \approx \bar P(k+1) + \sum_{i,j} e_i e_j' \left\{ P_u^{ij\,\prime}(k+1)[u(k)-\bar u(k)] + \tfrac{1}{2}[u(k)-\bar u(k)]' P_{uu}^{ij}(k+1)[u(k)-\bar u(k)] \right\} \tag{19}$$


and $m$ is the dimension of the control vector, $u_i$ is the $i$th element of the control vector.

It is clear from (21) that this approximate dual solution $u^D(k)$ is a modification of the cautious solution by the cost sensitivity terms. The cautious solution is (21) with $F = 0$ and $f = 0$; the terms $F$ and $f$ account for the dual effect. The implementation of this second-order dual solution is performed by the method described below.

Algorithm Summary:

1) Compute the sensitivity functions $J_{\theta\theta}(k+1)$, $J_P(k+1)$, $J_y(k+1)$, $J_{yy}(k+1)$ for (18) with $\bar\theta(k+1) = \hat\theta(k)$ and the nominal values $\bar u(k)$, $\bar P(k+1)$, $\bar y(k+1)$ defining the nominal path.

2) Search on (2) with (18) [with the sensitivity functions computed above, starting with first nominal values $\bar u(k)$, $\bar P(k+1)$] over $u(k)$ to obtain an improved nominal for which the cost is lower. This search is done by selecting a first coarse grid. A grid search is necessary to avoid locking in on a local minimum. Then another grid is chosen about the latter control over a narrower interval and from a second search $u^*(k)$ is obtained.

3) Using $u^*(k)$ compute the covariance sensitivities $P_u(k+1)$, $P_{uu}(k+1)$; together with the previously computed cost sensitivities $J_{\theta\theta}(k+1)$, $J_P(k+1)$, $J_{yy}(k+1)$, $J_y(k+1)$ obtain $F$, $f$ defined in (22), (23). Finally, the control to be applied, $u^D(k)$, is calculated from its explicit expression (21).

The iteration described in step 2) above is carried out to obtain better covariance sensitivities. The control $u^D(k)$ could have been obtained directly from (21) by skipping step 2) above; however, as indicated in [13] and [14], this results in unsatisfactory performance. With this iteration of step 2), the "improved" sensitivities yield good performance as shown in the next section.
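The coarse-then-fine search of step 2) can be sketched as follows. This is a generic two-stage grid search under stated assumptions, not the authors' code: the function name, the grid sizes, and the quadratic test cost are hypothetical, and the box limit of 2.0 mirrors the control limiter used in the simulations.

```python
import itertools
import numpy as np

def two_stage_grid_search(cost, limit=2.0, dim=2, n_coarse=9, n_fine=9):
    """Minimize cost(u) over the box [-limit, limit]^dim with a coarse and a fine grid."""
    # Stage 1: coarse grid over the whole admissible box
    coarse = np.linspace(-limit, limit, n_coarse)
    best = min(itertools.product(coarse, repeat=dim),
               key=lambda u: cost(np.array(u)))
    # Stage 2: narrower grid centered on the coarse minimizer, clipped to the box
    step = coarse[1] - coarse[0]
    axes = [np.clip(np.linspace(b - step, b + step, n_fine), -limit, limit)
            for b in best]
    best = min(itertools.product(*axes), key=lambda u: cost(np.array(u)))
    return np.array(best)

# Hypothetical cost with its minimum at (0.3, -1.7), used only to exercise the search
target = np.array([0.3, -1.7])
u_star = two_stage_grid_search(lambda u: float(np.sum((u - target) ** 2)))
```

In the algorithm of the text, `cost` would be the two-step cost (2) evaluated with the expansion (18); the coarse stage guards against a local minimum and the fine stage refines the nominal control used to compute the covariance sensitivities.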
IV. SIMULATION RESULTS

Performance  is  evaluated  from  500  Monte  Carlo  runs  for  the  following 
controllers: 

1)  heuristic  certainty  equivalence  [3]  (with  a  one-step  horizon); 

2)  one-step  ahead  cautious  controller;  and 

3) dual controller based upon sensitivity functions (with a two-step horizon) derived in Section III.

The plant equations for a two-input two-output system are

$$y_1(k+1) = -a_{11}y_1(k) - a_{12}y_2(k) + b_{11}u_1(k) + b_{12}u_2(k) + e_1(k+1) \tag{1}$$

$$y_2(k+1) = -a_{21}y_1(k) - a_{22}y_2(k) + b_{21}u_1(k) + b_{22}u_2(k) + e_2(k+1) \tag{2}$$

where $E\{e(k)e'(j)\} = W\delta_{kj} = \mathrm{diag}(W_1, W_2)\,\delta_{kj}$,

$$W_1 = 7.521; \quad W_2 = 431. \tag{3}$$

The true values of the parameters are

$$a_{11} = 0.8, \quad b_{11} = -74.84,$$
$$a_{12} = 0.1, \quad b_{12} = -51.04,$$
$$a_{21} = 0.2, \quad b_{21} = 53.31,$$
$$a_{22} = 0.75, \quad b_{22} = -82.56. \tag{4}$$

Only the gain parameters ($B$ matrix) are considered unknown for testing the dual effect and their initial estimates were generated as $\mathcal{N}(b_{ij}, b_{ij}^2)$, $i, j = 1, 2$. This choice of system was motivated by the helicopter vibration study [13].

A large initial uncertainty is chosen in the parameter estimates in order to test the learning capabilities of the various adaptive algorithms. The cost weighting matrices are

$$Q(k) = \mathrm{diag}(q_1, q_2); \quad q_1 = 1.0, \quad q_2 = 1.0. \tag{5}$$

The desired response is

$$y_r = [-18 \;\; 80]'. \tag{6}$$

For the model chosen (1)-(6) the optimal control solution in order to reach a steady-state value of $y_r$ in (6) is

$$u_1^* = 1.0, \quad u_2^* = -1.0. \tag{7}$$
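As a quick consistency check of (7), the steady state reached under a constant control can be computed from the noise-free plant equations. The sketch below assumes the plant in the matrix form $y(k+1) = -Ay(k) + Bu(k)$ with the true parameter values listed above; the variable names are illustrative only.

```python
import numpy as np

# True parameters of the simulation example, arranged as y(k+1) = -A y(k) + B u(k) + e(k+1)
A = np.array([[0.8, 0.1], [0.2, 0.75]])
B = np.array([[-74.84, -51.04], [53.31, -82.56]])
u_star = np.array([1.0, -1.0])   # control (7)
y_r = np.array([-18.0, 80.0])    # desired response (6)

# Noise-free steady state: y = -A y + B u  =>  (I + A) y = B u
y_ss = np.linalg.solve(np.eye(2) + A, B @ u_star)
deviation = np.abs(y_ss - y_r)
```

The resulting steady state lies within about half a unit of $y_r$ in each component, so (7) is consistent with the listed parameters to the printed precision.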

In terms of the notation of (1) and (2)

$$\theta(k) \triangleq [a_{11} \;\; a_{12} \;\; b_{11}(k) \;\; b_{12}(k) \;\; a_{21} \;\; a_{22} \;\; b_{21}(k) \;\; b_{22}(k)]' \tag{8}$$


TABLE I
AVERAGE COSTS FOR THE THREE ALGORITHMS IN THE SIMULATION WITH A LIMITER ($|u_1| \le 2.0$, $|u_2| \le 2.0$) (500 MONTE CARLO RUNS). THE SUPERIOR RATE OF ADAPTATION OF THE DUAL ALGORITHM IS DEMONSTRATED HERE

($C_k$ is the average cost at time step $k$; Sum denotes the cumulative cost $\sum_{i=1}^{k} C_i$.)

Time step         HCE              Cautious             Dual
    k         C_k      Sum       C_k      Sum       C_k      Sum
    1       14851    14851      3623     3623      6944     6944
    2        6241    21092      3961     7584      6722    13666
    3        3578    24670      3246    10830      4230    17896
    4        1616    26286      2836    13666      1866    19762
    5        1354    27640      2505    16171      1492    21254
    6         807    28447      2154    18325       953    22207
    7         593    29040      1921    20246       700    22907
    8         462    29502      1670    21916       582    23489
    9         397    29899      1623    23539       535    24024
   10         347    30246      1327    24866       385    24409
  ...         ...      ...       ...      ...       ...      ...
   40          77    34444       281    43810        89    29178

TABLE II
STATISTICAL SIGNIFICANCE TEST FOR COMPARISONS OF THE CAUTIOUS AND THE DUAL ALGORITHM IN THE SIMULATION WITH A LIMITER ($|u_1| \le 2.0$, $|u_2| \le 2.0$) (500 MONTE CARLO RUNS)

Time step    Test         Estimated
    k        statistic    improvement (%)
    1          -8.1          -91
    2          -5.3          -69
    3          -2.2          -30
    4           3.5           34
    5           3.3           40
    6           6.0           56
    7           6.3           64
    8           6.5           65
    9           6.5           67
   10           5.7           71
   11           6.3           76
   12           5.6           70
   13           5.9           82
   14           5.2           62
   15           5.5           79
   16           4.9           70
   17           4.5           78
   18           4.4           74
   19           4.4           76
   20           4.3           76

and

$$H(k) \triangleq \begin{bmatrix} -y_1(k) & -y_2(k) & u_1(k) & u_2(k) & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & -y_1(k) & -y_2(k) & u_1(k) & u_2(k) \end{bmatrix}. \tag{9}$$


The controllers are implemented with a sliding horizon for a total of 40 time steps. The evaluation criterion is

$$C_k = (y(k+1) - y_r)'\,Q(k)\,(y(k+1) - y_r). \tag{10}$$

A. Analysis of the Monte Carlo Average Costs

Comparisons are made between the performances of the cautious and the dual algorithm on the system, and a statistical significance analysis is done using the normal theory approach (i.e., it is assumed that the central limit theorem holds for the sample mean from a large number of runs) [14]. Tables I-IV contain the results of the simulation runs. Table I compares the average cost $C_k$ over 500 Monte Carlo runs for the first 40 time steps for the HCE, cautious, and dual algorithms, with a control limiter $|u_i| \le 2$, $i = 1, 2$.

Clearly it is seen that the cumulative average cost is the lowest for the dual controller. The HCE incurs an excessive penalty in time step 1 because of lack of caution. The cautious controller is overly cautious and exhibits slow convergence. However, the dual controller incurs less penalty in time step 1 than the HCE and makes a judicious choice of caution and probing to learn the parameters fast. Fig. 1 compares the performances of the three algorithms for 500 Monte Carlo runs. Both Table I and Fig. 1 demonstrate the superior rate of adaptation of the dual algorithm.

Table II provides a statistical significance test and shows the improved performance of the dual solution from time step 4 onwards with at least 98 percent confidence.
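The "estimated improvement" column of Table II can be cross-checked against the per-step average costs of Table I. The sketch below assumes, as an interpretation that the text does not state explicitly, that the improvement is the relative reduction of the cautious cost achieved by the dual controller, in percent; only the first ten time steps are available from Table I.

```python
# Average costs C_k for k = 1..10, read from Table I
cautious = [3623, 3961, 3246, 2836, 2505, 2154, 1921, 1670, 1623, 1327]
dual = [6944, 6722, 4230, 1866, 1492, 953, 700, 582, 535, 385]

# Assumed definition: improvement = 100 * (C_cautious - C_dual) / C_cautious
improvement = [100.0 * (c - d) / c for c, d in zip(cautious, dual)]

# Estimated improvement (%) for k = 1..10, read from Table II
table_ii = [-91, -69, -30, 34, 40, 56, 64, 65, 67, 71]
max_gap = max(abs(a - b) for a, b in zip(improvement, table_ii))
```

Under this assumed definition the recomputed values agree with Table II to within one percentage point, with the sign change between steps 3 and 4 matching the point from which the dual controller outperforms the cautious one.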

Table III indicates the percentage of runs where the cost exceeds 2000 for the two algorithms. This threshold of 2000 is selected from a sample distribution study of the cost at each time step. Table IV shows the percentile test [14], [15] comparing the cautious and the dual solution. They clearly indicate, from time step 4 onwards, the light-tailed nature of the distribution of the cost yielded by the new dual control algorithm.

B. Individual Time History Runs

Analysis of the Monte Carlo average cost indicates the improvement offered by the dual solution; it provides no information about the cautious control's turning-off phenomenon [16], [18]. Hence, a careful investigation of the individual runs is required to examine these occurrences.
TABLE III
COMPARISON OF THE TAILS USING THE CAUTIOUS AND THE DUAL ALGORITHMS IN THE SIMULATION WITH A LIMITER ($|u_1| \le 2.0$, $|u_2| \le 2.0$) (500 MONTE CARLO RUNS)

Time step    Percentage of runs which exceed 2000
    k          Cautious      Dual
    1             36          76
    2             60          52
    3             43          40
    4             33          25
    5             31          17
    6             22          10
    7             22           6
    8             19           7
    9             16           3
   10             12           2
   11             12           1.2
   12             10           1.4
   13             11           1.4
   14              7           1
   15              8           0.4
   16              6           0.4
   17              6           0.2
   18              6           0.4
   19              5           0.4
   20              5           0.2

[Plot: CAUTIOUS AND DUAL; horizontal axis: Time Step]

Fig. 2. Time history of output 1 using the cautious and the dual algorithms for run 90 (500 Monte Carlo runs; $|u_1| \le 2.0$; $|u_2| \le 2.0$).


TABLE IV
PERCENTILE TEST FOR COMPARISONS OF THE CAUTIOUS AND THE DUAL ALGORITHMS IN THE SIMULATION WITH A LIMITER ($|u_1| \le 2.0$, $|u_2| \le 2.0$) (500 MONTE CARLO RUNS)

Time step    $\chi^2$ test
    k        statistic
    1           -
    2           -
    3           -
    4           10
    5           19
    6           23
    7           32
    8           35
    9           57
   10           37
   11           40
   12           40
   13           40
   14           16
   15           32
   16           11
   17           16
   18           16
   19           18
   20           25

[Plot: CAUTIOUS AND DUAL; horizontal axis: Time Step]

Fig. 3. Time history of output 2 using the cautious and the dual algorithms for run 90 (500 Monte Carlo runs; $|u_1| \le 2.0$; $|u_2| \le 2.0$).

[Plot: CAUTIOUS, DUAL AND HCE; horizontal axis: Time Step]

Fig. 1. Time history of the average cost using the heuristic certainty equivalence, cautious, and the dual controllers. (500 Monte Carlo runs; $|u_1| \le 2.0$, $|u_2| \le 2.0$.) The superior rate of adaptation of the dual algorithm is demonstrated here.

[Plot: CAUTIOUS AND DUAL; horizontal axis: Time Step]

Fig. 4. Time history of control 1 using the cautious and the dual algorithms for run 90 (500 Monte Carlo runs; $|u_1| \le 2.0$; $|u_2| \le 2.0$).

[Plot: CAUTIOUS AND DUAL; horizontal axis: Time Step]

Fig. 5. Time history of control 2 using the cautious and the dual algorithms for run 90 (500 Monte Carlo runs; $|u_1| \le 2.0$; $|u_2| \le 2.0$).

The turn-off phenomenon is observed in many runs among the 500 Monte Carlo simulations while using the cautious controller; run 90 is a typical example of it. Both components are almost off between time steps 0 and 20, during which the dual controller already identified the parameters and reached the desired trajectory. Figs. 2-5 portray this result.

V.  Conclusions 

A new adaptive dual control solution with a two-step sliding horizon has been developed for an ARMA-MIMO system. The control law is derived by solving the stochastic dynamic programming equation. This solution utilizes the dual effect by performing a second-order Taylor series expansion of the expected future cost and does not need any tuning for any of the runs in the example. It modifies the cautious solution by explicit numerator and denominator correction terms. The controller in its present form is the first of its kind in a closed form for a system with unknown parameters. The controller is tested on a MIMO system in a systematic Monte Carlo fashion. Conclusions are based on 500 Monte Carlo runs. Analysis of the simulation runs has shown that this new dual control solution applied to a multiinput multioutput model improves over the cautious controller. The key improvement is in the avoiding of situations like turn-off and slow convergence, typical of the cautious solution.
ABSTRACT

The reversion in time of a stochastic difference equation in a hybrid space, with a Markovian solution, is presented. The reversion is obtained by a martingale approach, which previously led to reverse time forms for stochastic equations with Gauss-Markov or diffusion solutions. The reverse time equations follow from a particular non-canonical martingale decomposition, while the reverse time equations for Gauss-Markov and diffusion solutions followed from the canonical martingale decomposition. The need for this non-canonical decomposition stems from the hybrid state space situation. Moreover, the non-Gaussian discrete time situation leads to reverse time equations that incorporate a Bayesian estimation step.

1.  INTRODUCTION 

This paper addresses the problem of time-reversion of a hybrid state Markov process which is given as the solution of a stochastic difference equation. The desired result is a similar equation but running in reverse-time direction while having a solution that is respectively pathwise and in probability law equivalent to the solution of the forward equation.

The motivation to study this problem stems from two different kinds of application. The first is to approach the solution of a nonlinear smoothing problem by a merging of the estimates of two nonlinear filters: one filter matches the original model and is applied in the usual time direction while the other filter matches the time-reversed model and is applied in the reverse-time direction. The second application is the determination of a rate distortion theory lower bound for a discrete-time nonlinear filtering problem by the method of Galdos. This method is based on Bucy's representation formula and requires a Monte Carlo simulation in reverse-time direction of model matching trajectories, starting from a prespecified end point (Galdos, 1981; Washburn et al., 1985). For both of these two applications it is necessary to have a time-reversed difference equation for which the Markovian solutions are in probability law equivalent to the original solution.
Our problem falls in the category of how to reverse a Markov process in time. The Markov property implies that the past and the future are independent under the condition that the present state is known (Wentzell, 1981). This invariance with respect to the time direction is the key property used in time-reversion studies. There are two types of studies that deal with this problem: a classical type and a systems-type. The classical type of study assumes that the transition measure or the generator of a Markov process is given and then tries to characterize the transition measure in reverse-time direction (Nagasawa, 1964; Kunita and Watanabe, 1966; Chung and Walsh, 1969; Azéma, 1973; Hasegawa, 1976; Dynkin, 1978; Williams, 1979).

The systems-type of study assumes that a stochastic equation with a Markovian solution is given for which it tries to characterize the time-reversed equation. The first time-reversed equations were obtained by orthogonality arguments, for the linear Gaussian situation (Ljung and Kailath, 1976; Lainiotis, 1976). For general diffusions, it has already been pointed out by Stratonovich (1960) how to obtain the reversed-time equations by actually following the classical approach: from a stochastic equation via the generator and the time-reversed generator back to time-reversed equations. A truly systems-type of study has been started by Verghese and Kailath (1979), by showing how for a linear Gaussian system a more direct martingale approach leads in a simpler way to time-reversed equations.

Moreover, by this approach it was possible to obtain a reversed-time equation with a pathwise equivalent solution. Early elaborations of these ideas led, along different routes, to time-reversed equations with pathwise equivalent solutions (Anderson, 1983; Castanon, 1983; Pardoux, 1983). During subsequent studies, quite large classes of stochastic differential equations and their reversed-time equations have been identified (Elliott and Anderson, 1985; Pardoux, 1985; Elliott, 1986a, 1986b; Haussmann and Pardoux, 1986; Pardoux, 1986). Recently these results have been extended by using the Girsanov transformation of Brownian motion (Picard, 1986; Protter, 1987). Obviously, this Girsanov approach cannot be applied to discontinuous or discrete-time processes.

To give an idea of why there is an additional problem in using a martingale approach to the reversion of an equation with a discontinuous solution, we give a brief outline of the approach. The martingale approach roughly consists of checking if the time-reversed driving noise sequence can be decomposed in a suitable reverse-time martingale part and its complement and next, if such a decomposition exists (Jacod and Shiryaev, 1987; Jacod and Protter, 1988), selecting such a decomposition. The final step is to characterize both the martingale part and its complement. In contrast with a continuous process, such a decomposition is not unique for a discontinuous process (see for example, Jacod and Shiryaev, 1987). This makes the selection of a suitable martingale decomposition far from trivial in the hybrid state space situation, because a less good choice yields unnecessarily complicated reverse-time equations. This complication is presently unsolved, neither in continuous-time nor in discrete-time. It will be solved in the sequel for quite general difference equations in a hybrid space. With that result we subsequently reverse the considered difference equation in time.

The paper is organized as follows. In section 2 we define the hybrid state stochastic difference equation that will be considered and shortly compare its time-reversion with the time-reversion of a linear Gaussian equation. In section 3 we specify the time-reversion requirements. Next, in sections 4 and 5 we consider, respectively, the pathwise time-reversion and the in probability law equivalent time-reversion. In section 6 we discuss the results obtained.

2.  THE  STOCHASTIC  DIFFERENCE  EQUATION  CONSIDERED 

The stochastic difference equation we consider in the sequel is the following system, on an appropriate stochastic basis and a discrete time interval $[0,T] \triangleq \mathbb{N} \cap [0,T]$, $T < \infty$,

$$x_{t+1} = a(\theta_{t+1}, \theta_t, x_t, w_t), \tag{1.a}$$

$$\theta_{t+1} = b(\theta_t, v_t), \tag{1.b}$$

$$y_t = c(\theta_t, x_t, w_t, u_t), \tag{1.c}$$

where $\{w_t\}$, $\{u_t\}$ and $\{v_t\}$ are i.i.d. standard Gaussian sequences of dimension $p$, $q$ and 1 respectively, the initial distribution of $(x_0, \theta_0)$ has the density-mass function $p_{x_0,\theta_0}$, and $(w_t, v_t, u_t)$ is independent of $(x_0, \theta_0)$. Further $x_t$, $\theta_t$ and $y_t$ have respectively $\mathbb{R}^n$-, $M$- and $\mathbb{R}^m$-valued realizations (with $M$ a countable set), while $a$, $b$ and $c$ are measurable mappings of appropriate dimensions such that system (1) has a unique solution for each initial $(x_0, \theta_0)$ with $p_{x_0,\theta_0}(x_0, \theta_0) > 0$. Mappings $a$, $b$ and $c$ are time-invariant for notational simplicity only.

The second order dependence of (1.a) on $\{\theta_t\}$ is quite uncommon (Blom, 1985). Obviously, (1.a) reduces to the more common situation of first order dependence only if $a(\theta_{t+1}, \theta_t, x_t, w_t)$ is invariant w.r.t. either $\theta_t$ or $\theta_{t+1}$. The interpretation of (1.a) as an equation with a second order dependence on $\{\theta_t\}$ suggests the substitution of $\xi_t$, the pair of two consecutive values of $\{\theta_t\}$, in (1.a). On doing this (1.a) reduces to the more common equation, and it follows immediately that $\{\xi_t\}$ and $\{\xi_t, x_t\}$ are Markov processes. However, as the state space of $\xi_t$ is significantly larger than the state space of $\theta_t$, this is a rather brute force transformation of (1.a). A more elegant transformation of (1.a) to the more common equation consists of substituting (1.b) in (1.a), which yields an equation of the following form,

$$x_{t+1} = a'(\theta_t, x_t, w_t, v_t).$$

Instead of a state space expansion, there appears an additional noise term, $v_t$. From the latter representation, it follows immediately that the processes $\{\theta_t, x_t\}$ and $\{\theta_t\}$ are Markov processes. The latter transformation clearly shows that (1.a) is indeed more general than the more commonly studied equation with first order dependence on $\{\theta_t\}$. With the study of this more general equation, we also anticipate the time-reversion results obtained. In the sequel it will turn out that a reverse-time equation of (1.a) has, in general, a second order dependence on the time-reversed $\{\theta_t\}$, even when $a(\theta_{t+1}, \theta_t, x_t, w_t)$ is $\theta_t$-invariant. In view of this, it is natural to study the above more general form.

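In the sequel we consider the time-reversion of system (1) under the following assumptions:

A.1
$a(\theta, \eta, \cdot, w)$ has an inverse $a^*: M^2 \times \mathbb{R}^n \times \mathbb{R}^p \to \mathbb{R}^n$, such that for any $(\theta, \eta, w) \in M^2 \times \mathbb{R}^p$,

$$a^*(\theta, \eta, a(\theta, \eta, x, w), w) = x; \quad \text{all } x \in \mathbb{R}^n. \tag{2}$$

A.2
$b(\cdot, v)$ has an inverse $b^*: M \times \mathbb{R} \to M$, such that for any $v \in \mathbb{R}$,

$$b^*(b(\theta, v), v) = \theta; \quad \text{all } \theta \in M. \tag{3}$$

Assumptions A.1 and A.2 suggest to transform (1.a,b,c) to the following time-reversed model,

$$x_t = a^*(\theta_{t+1}, \theta_t, x_{t+1}, w_t),$$

$$\theta_t = b^*(\theta_{t+1}, v_t),$$

$$y_t = c(\theta_t, x_t, w_t, u_t).$$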

Because $(w_t, v_t)$ and the future (= reverse-time past), $\mathcal{F}_{t+1} \triangleq \sigma\{(y_s, x_s, \theta_s);\ s \in [t+1, T]\}$, are dependent, this is not the time-reversed system we should look for. Unfortunately, it is not clear how to continue from here. To develop some insight, we take a quick look at the time-reversion of a linear Gaussian system.
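The role of assumptions A.1 and A.2 can be made concrete with a small numerical sketch. The jump-linear maps below are hypothetical choices made only for illustration (two modes, scalar state, a mode flip when the noise exceeds a threshold); they are not taken from the paper. The sketch runs system (1.a,b) forward and then applies the inverses $a^*$ and $b^*$ with the same noise samples, which recovers the forward path exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 20
F = {0: 0.9, 1: 0.5}   # hypothetical mode-dependent gains; nonzero, so a is invertible in x
G = {0: 1.0, 1: 2.0}   # hypothetical mode-dependent noise gains

def b(theta, v):
    # (1.b): flip the mode when v > 1; for each fixed v this is a bijection of M = {0, 1}
    return theta ^ int(v > 1.0)

def b_star(theta_next, v):
    # inverse of b(., v), as required by A.2
    return theta_next ^ int(v > 1.0)

def a(theta_next, theta, x, w):
    # (1.a): second order dependence on the mode sequence
    return F[theta_next] * x + G[theta] * w

def a_star(theta_next, theta, x_next, w):
    # inverse of a(theta_next, theta, ., w), as required by A.1
    return (x_next - G[theta] * w) / F[theta_next]

# Forward pass
w = rng.standard_normal(T)
v = rng.standard_normal(T)
theta = [0]
x = [rng.standard_normal()]
for t in range(T):
    theta.append(b(theta[t], v[t]))
    x.append(a(theta[t + 1], theta[t], x[t], w[t]))

# Backward pass with the same noise samples
theta_rev = [None] * T + [theta[T]]
x_rev = [None] * T + [x[T]]
for t in reversed(range(T)):
    theta_rev[t] = b_star(theta_rev[t + 1], v[t])
    x_rev[t] = a_star(theta_rev[t + 1], theta_rev[t], x_rev[t + 1], w[t])
```

The backward pass reproduces the path only because it reuses the forward noise; as a generative reverse-time model it is unsuitable, since $(w_t, v_t)$ is not independent of the future, which is the difficulty the martingale decomposition addresses.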

Linear Gaussian example

Consider the following linear Gaussian system

$$x_{t+1} = A x_t + B w_t.$$

Assumption A.1 implies that $A$ is invertible, by which

$$x_t = A^{-1}(x_{t+1} - B w_t).$$

Obviously $w_t$ and the future $\mathcal{F}_{t+1}$ are dependent, which requires a martingale decomposition of $w_t$. In this linear Gaussian case the canonical martingale decomposition is the appropriate one. It consists of decomposing $w_t$ in its reverse-time predictable part, $E\{w_t \mid \mathcal{F}_{t+1}\}$, and its complement $w^*_t$:

$$w_t = E\{w_t \mid \mathcal{F}_{t+1}\} + w^*_t.$$

The problem is now to write the predictable part as a function of $x_{t+1}$ (if possible) and to characterize the covariance of $w^*_t$. As pointed out by Verghese and Kailath (1979) it follows readily from orthogonality arguments that

$$E\{w_t \mid \mathcal{F}_{t+1}\} = E\{w_t \mid x_{t+1}\},$$

while the fundamental formula for LLSE estimation yields

$$E\{w_t \mid x_{t+1}\} = B^T R^{-1}(t+1)\,x_{t+1},$$

$$\mathrm{Cov}\{w^*_t\} = I - B^T R^{-1}(t+1) B,$$

where $R(t+1)$ is the covariance of $x_{t+1}$.

By a straightforward substitution of these results we obtain

$$x_t = A^{-1}\left(x_{t+1} - B B^T R^{-1}(t+1)\,x_{t+1} - B w^*_t\right),$$

which yields the desired reverse-time system:

$$\tilde x_t = A^{-1}\left(\tilde x_{t+1} - B B^T R^{-1}(t+1)\,\tilde x_{t+1} - B \tilde w_t\right).$$
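The linear Gaussian construction can be checked numerically. The sketch below uses hypothetical matrices $A$ and $B$ and assumes a zero-mean initial state with identity covariance; it propagates $R(t)$ forward, forms $w^*_t = w_t - B^T R^{-1}(t+1)x_{t+1}$ along a simulated path, and verifies that the reverse-time recursion driven by $w^*_t$ retraces the forward path and that the stated covariance of $w^*_t$ is positive definite.

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.array([[0.9, 0.2], [0.0, 0.7]])   # hypothetical invertible system matrix
B = np.array([[1.0, 0.0], [0.5, 1.0]])   # hypothetical noise gain
T = 10

# Covariance recursion R(t+1) = A R(t) A' + B B', with R(0) = I assumed
R = [np.eye(2)]
for t in range(T):
    R.append(A @ R[t] @ A.T + B @ B.T)

# Forward simulation x_{t+1} = A x_t + B w_t
w = rng.standard_normal((T, 2))
x = [rng.standard_normal(2)]
for t in range(T):
    x.append(A @ x[t] + B @ w[t])

# Reverse-time recursion driven by w*_t = w_t - B' R^{-1}(t+1) x_{t+1}
x_rev = x[T]
min_eig = np.inf
for t in reversed(range(T)):
    R_inv = np.linalg.inv(R[t + 1])
    w_star = w[t] - B.T @ R_inv @ x[t + 1]
    cov_w_star = np.eye(2) - B.T @ R_inv @ B          # Cov{w*_t} from the LLSE formula
    min_eig = min(min_eig, np.linalg.eigvalsh(cov_w_star).min())
    x_rev = np.linalg.solve(A, x_rev - B @ B.T @ R_inv @ x_rev - B @ w_star)
```

The pathwise agreement is an algebraic identity; the substantive content of the reverse-time model is that $w^*_t$ is uncorrelated with $x_{t+1}$, so that in the Gaussian case it can be generated independently of the future when simulating backward.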

The orthogonality arguments and the LLSE estimation step, used in the above procedure, prevent a straightforward extension of that procedure to equation (1). In the sequel we replace the orthogonality arguments and the LLSE estimation step respectively by Markov duality arguments and a Bayesian estimation step. Besides this, we have to select an appropriate martingale decomposition. Following the linear Gaussian case, the canonical martingale decomposition seems a good candidate:

$$(w_t, v_t) = (w^*_t, v^*_t) + E\{(w_t, v_t) \mid \mathcal{F}_{t+1}\}.$$

Unfortunately, this decomposition leads to very complicated elaborations of the Bayesian estimation step. To avoid these complications, we use in this paper the following decomposition:

$$(w^*_t, v^*_t) = (w_t, v_t) - (\hat w_t, \hat v_t),$$

with $\hat v_t \triangleq E\{v_t \mid \mathcal{F}_{t+1}\}$ and $\hat w_t \triangleq E\{w_t \mid \mathcal{F}_{t+1}, v_t\}$.

The main step that must be carried out is to prove that the latter is a martingale decomposition, and to elaborate on the Bayesian estimation step. For the presentation of these results a constructive approach is taken, starting with a precise description of the time-reversion objectives.

3. TIME-REVERSION OBJECTIVES

We want to obtain a time-reversed version of system (1), such that its solution, $\{\tilde y_t, \tilde x_t, \tilde\theta_t\}$, is in some sense equivalent to $\{y_t, x_t, \theta_t\}$. To make this objective explicit it needs both a specification of what we mean by a time-reversion of (1), and a specification of the desired sense of process equivalence.

By a reverse-time system we mean a stochastic difference equation which starts at time $T$ and runs in negative time direction on the interval $[0,T]$. We require from a time-reversion of system (1) that it does not change the state space and that the solution of the resulting reverse-time system represents the process $\{\tilde y_t, \tilde x_t, \tilde\theta_t\}$. More specifically, $\{\tilde y_t, \tilde x_t, \tilde\theta_t\}$ must be the solution of the following system of stochastic difference equations, all $t \in [0,T-1]$:

$$\tilde x_t = \tilde a(t, \tilde\theta_{t+1}, \tilde\theta_t, \tilde x_{t+1}, \tilde w_t), \tag{4.a}$$

$$\tilde\theta_t = \tilde b(t, \tilde\theta_{t+1}, \tilde x_{t+1}, \tilde v_t), \tag{4.b}$$

$$\tilde y_t = \tilde c(t, \tilde\theta_{t+1}, \tilde\theta_t, \tilde x_{t+1}, \tilde x_t, \tilde w_t, u_t), \tag{4.c}$$

where $\tilde a$, $\tilde b$ and $\tilde c$ are deterministic mappings of appropriate dimensions and $\{\tilde w_t, \tilde v_t\}$ is a noise sequence to be specified. For a better understanding of (4) notice that the substitutions of (4.a) in (4.c) and of (4.b) in (4.a,c) transform (4) to a reverse-time system of the more common form:

$$\tilde x_t = \tilde a'(t, \tilde\theta_{t+1}, \tilde x_{t+1}, \tilde w_t, \tilde v_t),$$

$$\tilde\theta_t = \tilde b(t, \tilde\theta_{t+1}, \tilde x_{t+1}, \tilde v_t),$$

$$\tilde y_t = \tilde c'(t, \tilde x_{t+1}, \tilde\theta_{t+1}, \tilde w_t, \tilde v_t, u_t), \quad t \in [0,T-1].$$

To be a useful reverse-time system, $\{\tilde w_t, \tilde v_t\}$ should, as much as possible, be independent of the future (= reversed-time past) information field

$$\tilde{\mathcal{F}}_{t+1} \triangleq \sigma\{(\tilde y_s, \tilde x_s, \tilde\theta_s, \tilde w_s, \tilde v_s, u_s);\ s \in [t+1,T]\}.$$

A minimal requirement is then that the conditional expectation of $(\tilde w_t, \tilde v_t)$, given $\tilde{\mathcal{F}}_{t+1}$, should be zero. Because $\{\tilde{\mathcal{F}}_t\}$ is a decreasing sequence of sigma algebras, the latter can most easily be put in martingale language (see Elliott, 1982; Kumar and Varaiya, 1986; and the definitions below):


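1 Definition

Assume $\{\mathcal{G}_t;\ t \in [0,T]\}$ is an increasing sequence of information fields, i.e. $\mathcal{G}_{s-1} \subset \mathcal{G}_s$, any $s \in [1,T]$.

A random sequence $\{\varepsilon_t\}$ is said to be a Martingale Difference sequence w.r.t. $\mathcal{G}_t$ iff for all $t \in [0,T]$,

(i) $\varepsilon_t$ is $\mathcal{G}_t$-measurable,

(ii) $E\{|\varepsilon_t|\} < \infty$,

(iii) $E\{\varepsilon_t \mid \mathcal{G}_s\} = 0$ a.s., for all $s \in [0,t-1]$.

2 Definition

Assume $\{\mathcal{F}_t;\ t \in [0,T]\}$ is a decreasing sequence of information fields, i.e. $\mathcal{F}_s \subset \mathcal{F}_{s-1}$, any $s \in [1,T]$.

A random sequence $\{\varepsilon_t\}$ is said to be a reverse-time Martingale Difference sequence w.r.t. $\mathcal{F}_t$ iff for all $t \in [0,T]$,

(i) $\varepsilon_t$ is $\mathcal{F}_t$-measurable,

(ii) $E\{|\varepsilon_t|\} < \infty$,

(iii) $E\{\varepsilon_t \mid \mathcal{F}_s\} = 0$ a.s., all $s \in [t+1,T]$.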


Having specified the desired type of reverse-time system, the next step is to specify the types of equivalence of solutions of systems (1) and (4), in which we are interested. For stochastic processes several useful types of equivalence have been defined and named in the past. We restrict ourselves to the two most important types of equivalence and their unambiguous names (Elliott, 1982; Jacod and Shiryaev, 1987):

- indistinguishable,

- equivalent in law.

Definitions are given below.

3 Definition

Two processes $\{\varepsilon_t\}$ and $\{\tilde\varepsilon_t\}$, $t \in [0,T]$, are said to be indistinguishable if they are defined on the same probability space $(\Omega, \mathcal{F}, P)$ and

$$P\{\varepsilon_t = \tilde\varepsilon_t,\ t \in [0,T]\} = 1. \tag{5}$$

4 Definition

Two processes $\{\varepsilon_t\}$ and $\{\tilde\varepsilon_t\}$, $t \in [0,T]$, are said to be equivalent in law, if they have the same state space, $E$, and for all $0 \le t_1 < t_2 < \ldots < t_k \le T$,

$$P\{(\varepsilon_{t_1}, \ldots, \varepsilon_{t_k}) \in dX\} = P\{(\tilde\varepsilon_{t_1}, \ldots, \tilde\varepsilon_{t_k}) \in dX\}, \tag{6}$$

for any $k$ and all measurable $dX \subset E^k$.

For discrete-time processes (5) is satisfied if and only if, for all $t \in [0,T]$, $\varepsilon_t = \tilde\varepsilon_t$ almost surely. Our objective in the sequel is to obtain time-reversed systems of type (4), with solutions that are respectively indistinguishable and equivalent in law w.r.t. the solution of (1).

4 INDISTINGUISHABLE TIME-REVERSION

In this section we derive a type (4) version of system (1), such that
their solutions, $(\tilde y_t, \tilde x_t, \tilde\theta_t)$ and
$(y_t, x_t, \theta_t)$, are indistinguishable, and illustrate these
results for a jump-linear example.

The first step of our derivation consists of a substitution of (2) and
(3) in (1), to arrive at the time-reversed system discussed in
section 2,

$$x_t = a^*(\theta_{t+1}, \theta_t, x_{t+1}, w_t), \tag{7.a}$$

$$\theta_t = b^*(\theta_{t+1}, v_t), \tag{7.b}$$

$$y_t = c(\theta_t, x_t, w_t, u_t). \tag{7.c}$$

Although (7) and (4) look similar, one requirement is not met: the
driving noise in (7) is not a reverse-time Martingale Difference
sequence w.r.t. the future information field

$$\mathcal{F}_t \triangleq \sigma\{(y_s, x_s, \theta_s, w_s, v_s, u_s);\ s \in [t,T]\}. \tag{8}$$

Therefore our next step is to introduce a particular reverse-time
Martingale Difference sequence, $\{w_t^*, v_t^*\}$, as follows,

$$(w_t^*, v_t^*) \triangleq (w_t, v_t) - (\hat w_t, \hat v_t), \tag{9.a}$$

with

$$\hat v_t \triangleq E\{v_t \mid \mathcal{F}_{t+1}\}, \tag{9.b}$$

$$\hat w_t \triangleq E\{w_t \mid \mathcal{F}_{t+1}, v_t\}, \quad \text{all } t \in [0,T-1], \tag{9.c}$$

and $(w_T^*, v_T^*) = 0$.

Notice that the definition of $\hat w_t$ differs significantly from the
reverse-time predictable process $E\{w_t \mid \mathcal{F}_{t+1}\}$. As
such the decomposition in (9) is not the unique canonical
decomposition (see Jacod and Shiryaev, 1987). The introduction of this
non-canonical decomposition is a crucial step necessary for obtaining
the time-reversion of hybrid state system (1).

In the sequel we verify that $\{w_t^*, v_t^*\}$ is indeed a
reverse-time Martingale Difference sequence w.r.t. $\mathcal{F}_t$, and
thus also w.r.t.
$\mathcal{F}_t^* \triangleq \mathcal{F}_t \vee \sigma\{(w_s^*, v_s^*);\ s \in [t,T]\}$.
Moreover we show that, due to the duality of the Markov property,
$(w_t, v_t)$ is conditionally independent of $\mathcal{F}_{t+2}$ given
$(\theta_{t+1}, x_{t+1})$.

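5 Theorem

Assume $(w_t, v_t)$, $(\hat w_t, \hat v_t)$ and $(w_t^*, v_t^*)$ satisfy
(1) and (9). Then $\{w_t^*, v_t^*\}$ is a reverse-time Martingale
Difference sequence w.r.t. $\mathcal{F}_t^*$, while $\hat w_t$ and
$\hat v_t$ satisfy:

$$\hat w_t = E\{w_t \mid \theta_{t+1}, \theta_t, x_{t+1}\}, \tag{10.a}$$

$$\hat v_t = E\{v_t \mid \theta_{t+1}, x_{t+1}\}, \quad \text{all } t \in [0,T-1]. \tag{10.b}$$

Proof: See Blom and Bar-Shalom (1989).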

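Theorem 5 implies that $\hat w_t$ and $\hat v_t$ can be written as

$$\hat w_t = f(t, \theta_{t+1}, \theta_t, x_{t+1}), \tag{11.a}$$

$$\hat v_t = g(t, \theta_{t+1}, x_{t+1}). \tag{11.b}$$

Substitution of (9.a) and (11.a,b) in (7.a,b,c) yields

$$x_t = \tilde a(t, \theta_{t+1}, \theta_t, x_{t+1}, w_t^*), \tag{12.a}$$

$$\theta_t = \tilde b(t, \theta_{t+1}, x_{t+1}, v_t^*), \tag{12.b}$$

$$y_t = \tilde c(t, \theta_{t+1}, \theta_t, x_{t+1}, x_t, w_t^*, u_t), \tag{12.c}$$

with

$$\tilde a(t, \theta, \eta, x, w^*) = a^*(\theta, \eta, x, w^* + f(t, \theta, \eta, x)), \tag{13.a}$$

$$\tilde b(t, \theta, x, v^*) = b^*(\theta, v^* + g(t, \theta, x)), \tag{13.b}$$

$$\tilde c(t, \theta, \eta, x, z, w^*, u) = c(\eta, z, w^* + f(t, \theta, \eta, x), u). \tag{13.c}$$

The above result is summarized by the following corollary.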

6 Corollary

Under assumptions A.1 and A.2, the solution
$(\tilde y_t, \tilde x_t, \tilde\theta_t)$ of the reverse-time system
(4) is indistinguishable from the solution $(y_t, x_t, \theta_t)$ of
system (1) if

(i) $(\tilde y_T, \tilde x_T, \tilde\theta_T) = (y_T, x_T, \theta_T)$ a.s.,

(ii) $\tilde a$, $\tilde b$ and $\tilde c$ satisfy (13.a,b,c),

(iii) $(\tilde w_t, \tilde v_t) = (w_t^*, v_t^*)$, all $t \in [0,T-1]$,

with $w_t^*$ and $v_t^*$ satisfying (9.a) and (10).

Jump-linear example

To illustrate the results obtained so far, let us consider the
particular situation of a linear system with first order Markovian
switching coefficients and observation noise independent of the system
driving noise. Both $a(\theta, \eta, x, w)$ and $c(\eta, x, w, u)$ are
then linear in $(x, w)$, while the first is $\eta$-invariant and the
second is $w$-invariant, by which system (1) simplifies to,

$$x_{t+1} = A(\theta_{t+1})\, x_t + B(\theta_{t+1})\, w_t,$$

$$\theta_{t+1} = b(\theta_t, v_t),$$

$$y_t = G(\theta_t)\, x_t + H(\theta_t)\, u_t.$$

Then from Corollary 6 we readily find the indistinguishable
time-reversed system,

$$x_t = A^{-1}(\theta_{t+1})\,[x_{t+1} - B(\theta_{t+1})(\hat w_t + w_t^*)],$$

$$\theta_t = b^*(\theta_{t+1}, \hat v_t + v_t^*),$$

$$y_t = G(\theta_t)\, x_t + H(\theta_t)\, u_t,$$

where $\{w_t^*, v_t^*\}$ is the reverse-time MD-sequence of Theorem 5,
$\hat v_t = g(t, \theta_{t+1}, x_{t+1})$, and $f$, $g$ and $b^*$ are
according to (11) and (13.b). The difference equation for $x_t$ is
similar to the one for the linear Gaussian example in section 2. But
due to $\hat w_t$, it may even be nonlinear in $x_{t+1}$. At the end of
the next section we will show that some further simplifications are
possible for this example in the case of equivalence in probability
law.
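
The algebra behind the reversed state equation of the jump-linear
example can be checked numerically. The following is a minimal Python
sketch, assuming a scalar state and made-up values for the coefficients
A and B (both hypothetical, not taken from the paper). It only
illustrates that any split of the driving noise into two parts that sum
to w_t recovers x_t exactly; it does not compute the conditional
expectation that the theory prescribes for the first part.

```python
# Scalar jump-linear step and its algebraic reversal (illustrative values only).
A = {1: 0.9, 2: 1.2}   # hypothetical A(theta); must be nonzero to be invertible
B = {1: 1.0, 2: 0.5}   # hypothetical B(theta)


def forward_step(theta_next, x_t, w_t):
    """x_{t+1} = A(theta_{t+1}) x_t + B(theta_{t+1}) w_t."""
    return A[theta_next] * x_t + B[theta_next] * w_t


def reverse_step(theta_next, x_next, w_hat, w_star):
    """x_t = A^{-1}(theta_{t+1}) [x_{t+1} - B(theta_{t+1}) (w_hat + w_star)]."""
    return (x_next - B[theta_next] * (w_hat + w_star)) / A[theta_next]


x_t, w_t, theta_next = 2.0, 0.3, 2
x_next = forward_step(theta_next, x_t, w_t)

# Here w_hat is an arbitrary number standing in for the conditional mean;
# w_star is then whatever remains of w_t.
w_hat = 0.1
x_back = reverse_step(theta_next, x_next, w_hat, w_t - w_hat)
```

The exact recovery of x_t holds for every choice of `w_hat`; what makes
the decomposition in the text special is the martingale difference
property of the remainder, which this sketch does not test.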

5 EQUIVALENT IN LAW TIME-REVERSION

In this section we derive conditions under which the solutions of (1)
and (4) are equivalent in law, and discuss these results for a
jump-linear example. So far our line of reasoning is quite similar to
the martingale approach of time-reversing a diffusion. However, things
are quite different now that we require equivalence in law only. The
reason is that while in the diffusion situation this requires that
$d\tilde w_t$ and $dw_t^*$ are equivalent in law, no similar simple
result holds in the discrete-time situation. Instead, we identify the
relation between the conditional laws of $\tilde w_t$ and $w_t^*$ by a
Bayesian estimation step. Next we characterize $f$ and the required
law of $w_t^*$.

7 Theorem

Under assumption A.1 the solution
$(\tilde y_t, \tilde x_t, \tilde\theta_t)$ of reverse-time system (4)
is equivalent in law w.r.t. the solution $(y_t, x_t, \theta_t)$ of
system (1) if,

(i) $P\{(\tilde y_T, \tilde x_T, \tilde\theta_T) \in dX\} = P\{(y_T, x_T, \theta_T) \in dX\}$,
for any measurable $dX \subset \mathbb{R}^m \times \mathbb{R}^n \times M$,

(ii) $\tilde a$ and $\tilde c$ satisfy (13.a,c),

(iii) $P\{\tilde\theta_t = \eta \mid \tilde\theta_{t+1} = \theta, \tilde x_{t+1} = x\} = P\{\theta_t = \eta \mid \theta_{t+1} = \theta, x_{t+1} = x\}$,

(iv) $P\{\tilde w_t \in dX \mid (\tilde x_{t+1}, \tilde\theta_{t+1}, \tilde\theta_t) = (x, \theta, \eta)\} = P\{w_t^* \in dX \mid (x_{t+1}, \theta_{t+1}, \theta_t) = (x, \theta, \eta)\}$,

all $(x, \theta, \eta, t) \in \mathbb{R}^n \times M^2 \times [0,T-1]$ and
measurable $dX \subset \mathbb{R}^p$, with $w_t^*$ and $f$ satisfying
(9.a), (10.a) and (11.a).

Proof: See Blom and Bar-Shalom (1989).

Our remaining problem is the characterization of the conditional law
of $w_t^*$. As this is actually a discrete-time nonlinear filtering
problem, it can be done by applying Bayes formula. We do this under
the following additional assumptions:

A.3. The a priori distribution of $(x_t, \theta_t)$ permits a
density-mass function for all $t \in [0,T]$.

A.4. $a^*(\theta, \eta, x, w)$ is once differentiable in
$x \in \mathbb{R}^n$ for all $(\theta, \eta, w) \in M^2 \times \mathbb{R}^p$.

If the distributions in (iv) of Theorem 7 have density-mass functions
then it can easily be verified that (iv) implies,

$$p_{\tilde w_t \mid \tilde\theta_{t+1}, \tilde\theta_t, \tilde x_{t+1}}(w \mid \cdot) = p_{w_t \mid \theta_{t+1}, \theta_t, x_{t+1}}(w + \hat w_t \mid \cdot), \tag{14.a}$$

where $\hat w_t$ satisfies (10.a).

With this our remaining step is to characterize the density at the
right-hand side of (14.a) by applying Bayes formula.

8 Proposition

Under assumptions A.3 and A.4, the distribution in (iv) of Theorem 7
permits a density which is characterized by (14.a) and,

$$p_{w_t \mid \theta_{t+1}, \theta_t, x_{t+1}}(\cdot \mid \theta, \eta, x) = c(\theta, \eta, x)\, p_{w_t}(\cdot)\, p_{x_t \mid \theta_t}(a^*(\theta, \eta, x, \cdot) \mid \eta)\, |\nabla_x a^{*T}(\theta, \eta, x, \cdot)|, \tag{14.b}$$

with $\nabla_x$ the gradient and $c$ either a normalizing factor or
zero iff $p_{x_{t+1} \mid \theta_{t+1}, \theta_t}(x \mid \theta, \eta) = 0$.

Moreover,

Proof: See Blom and Bar-Shalom (1989).

Jump-linear example

For a linear system with first order Markovian switching coefficients
we arrived, in section 4, at the following reversed-time equation:

$$x_t = A^{-1}(\theta_{t+1})\,[x_{t+1} - B(\theta_{t+1})(\hat w_t + w_t^*)],$$

with $w_t^*$ the reverse-time MD sequence and
$\hat w_t = E\{w_t \mid \theta_{t+1}, \theta_t, x_{t+1}\}$. Because
$a^*$ is linear in $(x, w)$, its gradient w.r.t. $x$ is $w$-invariant,
by which Proposition 8 yields

$$p_{w_t \mid \theta_{t+1}, \theta_t, x_{t+1}}(w \mid \theta, \eta, x) = c(\theta, \eta, x)\, p_{w_t}(w)\, p_{x_t \mid \theta_t}(A^{-1}(\theta)[x - B(\theta) w] \mid \eta).$$

In spite of the simplification this is a form which is in general
quite complex, by which $\hat w_t$ may still be a nonlinear function of
$x_{t+1}$. Obviously, this type of complexity could have been expected,
as it is well known that a discrete-time Bayesian estimation step
leads to nonlinear equations, unless the prior densities involved are
Gaussian.
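
The Bayesian estimation step for the jump-linear example of section 5
can be illustrated numerically. The sketch below is a minimal Python
example under several assumptions that are not in the paper: a scalar
state, one fixed pair of modes (so A and B reduce to the numbers `a` and
`b`), Gaussian densities with made-up parameters, and a plain Riemann
sum on a grid. It evaluates the conditional mean of w_t given x_{t+1}
from a posterior proportional to p_w(w) p_x((x - b w)/a), and compares
it with the closed form that is available in the all-Gaussian case.

```python
import math


def gauss(z, mean, var):
    """Gaussian density with the given mean and variance."""
    return math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def w_hat_grid(x_next, a, b, p_w, p_x, lo=-10.0, hi=10.0, n=4001):
    """Approximate E{w_t | x_{t+1} = x_next} for one fixed mode pair.

    The unnormalized posterior of w is p_w(w) * p_x((x_next - b*w) / a);
    the normalizing factor cancels in the ratio num / den.
    """
    step = (hi - lo) / (n - 1)
    num = den = 0.0
    for i in range(n):
        w = lo + i * step
        weight = p_w(w) * p_x((x_next - b * w) / a)
        num += w * weight
        den += weight
    return num / den


# Hypothetical parameters: x_t ~ N(m, s2), w_t ~ N(0, q), x_{t+1} = a x_t + b w_t.
a, b, q, m, s2 = 0.8, 1.0, 0.5, 1.0, 2.0
x_next = 3.0

numeric = w_hat_grid(x_next, a, b,
                     p_w=lambda w: gauss(w, 0.0, q),
                     p_x=lambda x: gauss(x, m, s2))

# Linear-Gaussian conditional mean: cov(w, x_next) / var(x_next) * (x_next - mean).
closed_form = b * q * (x_next - a * m) / (a * a * s2 + b * b * q)
```

With Gaussian priors the two values agree and the estimate is linear in
`x_next`. Passing a non-Gaussian `p_x` (for instance a mixture) to the
same function is one way to see the nonlinearity that the text refers
to.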


6 CONCLUDING REMARKS

We considered the problem of reversing the Markov solution of a
nonlinear stochastic difference equation in time. The nonlinearities
were due to nonlinear coefficients and a hybrid state space, i.e. a
product of a Euclidean space and a discrete set. For simplicity, it
was assumed that the process in the discrete set satisfies the Markov
property. Subsequently we gave a precise description of our time
reversion objectives: the development of time-reversed difference
equations, of forms similar to the original equation, but driven by
reversed-time martingale difference sequences, such that their
solutions are respectively indistinguishable from and in probability
law equivalent to the solution of the original equation. Following
this, the derivation of the indistinguishable reverse-time equation was
performed. The main new theoretical result is the introduction and
evaluation of a non-canonical (Jacod and Shiryaev, 1987) reverse-time
martingale decomposition, which is appropriate to the hybrid state
space situation. In contrast with this, all previous reverse-time
equations are based on a canonical martingale decomposition. After
that, it was shown how the in probability law equivalent time-reversed
system can be obtained by introducing an appropriate Bayesian
estimation step. As expected, this Bayesian estimation step leads to
closed form equations whose dimensionality often complicates further
applications. In view of this, in Blom and Bar-Shalom (1989) we
elaborate the Bayesian step for linear systems with Markovian
switching coefficients (jump-linear systems), and apply the results to
smoothing a trajectory with hidden manoeuvres.
Abstract. A realistic stochastic control problem for hybrid systems
with Markovian jump parameters may have the switching parameters in
both the state and measurement equations. Furthermore, both the system
state and the jump states may not be perfectly observed. Currently the
only existing implementable controller for this problem is based upon
a heuristic multiple model partitioning (MMP) and hypothesis pruning.
In this paper we present a stochastic control algorithm for stochastic
systems with Markovian jump parameters. The control algorithm is
derived through the use of stochastic dynamic programming and is
designed to be used for realistic stochastic control problems, i.e.,
with noisy state observations. The state estimation and model
identification is done via the recently developed Interacting Multiple
Model algorithm. Simulation results show that a substantial reduction
in cost can be obtained by this new control algorithm over the MMP
scheme.

Keywords. Stochastic control; Dynamic programming; Hybrid systems;
Multiple model partitioning; Markovian jump parameters.


1. INTRODUCTION

An important problem of engineering concern is the control of
discrete-time stochastic systems with parameters that may switch among
a finite set of values. In this paper we present the development of a
controller for discrete-time hybrid jump-linear Gaussian systems. Here
the state and measurement equations have parameter matrices which are
functions of a Markov switching process. The jump states are not
observed and only the state is observed in the presence of noise.

Along with presenting a desirable practical control algorithm we also
point out an interesting theoretical phenomenon. We show that there is
a natural connection between the interacting multiple model (IMM)
state estimation algorithm [B1] and the control of jump-linear
systems. Thus the IMM is the state estimation algorithm of choice for
use in these types of control problems.

Systems which pertain to the jump-linear modelling methodology are
found in many areas. Systems of a highly nonlinear nature can be
approximated by a set of linearized models [M3, V1, V2]. A failure in
a component of a dynamical system (or subsequent repair) can be
represented by a sudden change in the system's parameters [B2, S1,
V1]. Economic problems can also be modelled by parameters that are
subject to sudden changes due to shortages in important materials
[G2]. As is noted in [M6], there also exist applications to the design
of control systems for large flexible structures in space.

There has been an extensive amount of work done in this area and on
the related problem of controlling stochastic dynamic systems with
unknown, time-invariant parameters. We refer the reader to [T3] and
[G3] for a list of references and a discussion of their scope and
applications.
More recently in [S2] a feedforward/feedback controller was presented
for the continuous-time problem with a completely observed system
state and where the "modal indicator" is measured with a high quality
sensor. In [M6] the continuous-time jump-linear problem is considered
where the system state and "modal processes" are perfectly observed.
The optimal regulator was obtained and notions of stochastic
stabilizability and detectability were introduced to characterize the
behavior of the optimal system on long time intervals. In [M7] the
continuous-time jump-linear problem with additive and multiplicative
noises and noisy measurements of the plant state was considered with
the plant mode assumed perfectly observed.

In [E1] a sufficient stability test is given for checking the
asymptotic behavior of the error introduced by the averaging of hybrid
systems.

In [M8] the continuous-time jump-linear problem with non-Markovian
regime changes was considered. A control scheme was presented for the
case of perfect observations of the system state and plant regime.

In [C3] a discrete-time Markovian jump optimal control problem was
considered. The controller is for the case of perfect system state
observations and known form process. They derive necessary and
sufficient conditions for the existence of optimal constant control
laws which stabilize the controlled system as the time horizon becomes
infinite. Through examples they show the interesting result that
stabilizability of the system in each form is neither necessary nor
sufficient for the existence of a stable steady-state closed-loop
system.

In [Y1] a discrete-time system with perfect state and mode information
was considered. A controller was presented which is stabilizing in the
mean square exponential sense.

As pointed out in [G2], we generally cannot determine the optimal
jump-linear quadratic Gaussian closed-loop control law analytically
even for a two-step problem. In order to compute the optimal control,
extensive numerical search methods must be employed and thus one would
like to find simpler suboptimal control schemes.

Currently the only existing implementable controller for this problem
(switching parameters in the system state and measurement equations
and noisy state observations) is the one discussed in [T3] and is of
the OLOF class. This algorithm is based upon a heuristic multiple
model partitioning (MMP) and hypothesis pruning. The MMP approach,
being simple and straightforward to implement, is a reasonable choice
for the unknown parameter problem (U), and as shown in [T3] it works
well for applications involving switching parameters in the state
measurement equation only. For the non-switching parameter problem the
operating mode is determined to a high probability in a relatively
short period of time and the MMP approach gives the linear quadratic
Gaussian optimal control.

For switching parameter problems a different situation exists. Here,
because of the switching, the operating mode may not be determined to
a high probability. The proposed approach to deriving a suboptimal
control scheme is to start with the solution to the optimal control
problem via the use of stochastic dynamic programming.

By utilizing dynamic programming and making appropriate suboptimal
assumptions the use of numerical search methods has been avoided. We
thus have developed a multiple model control scheme which has the
following desirable properties: (a) it gives the optimal final
control, (b) the algorithm utilizes the IMM state estimation scheme,
and (c) it has the same property as the MMP approach in that it gives
the optimal linear quadratic control under the assumption of a
perfectly known model history sequence (which is however an
unrealistic assumption for this class of problems).

For comparison purposes we implement the "switching parameters in the
system state equation" controller, proposed (but not tested) in [T3].
We show via example that a statistically significant reduction in cost
can be achieved through the use of our controller, which also belongs
to the OLOF class.

The paper is outlined as follows. In section 2 the problem formulation
is given. In section 3 an interesting connection between the IMM state
estimation algorithm and the control of multiple model systems is
shown to exist. In section 4 we obtain the control algorithm. A new
"full-tree" control algorithm is derived which utilizes all possible
future parameter history sequences. In section 5 we use simulations to
compare the MMP control algorithm with the full-tree controller.

2. PROBLEM FORMULATION

The problem to be solved is discussed next. We took the pragmatic
approach of starting with the available mathematical and statistical
tools found to yield success in solving similar problems of this type
in the past (i.e., use is made of the stochastic dynamic programming
method and the total probability theorem, etc.). As we shall see, not
only does this practical engineering approach yield an improved
multiple model control algorithm, but it also leads to the interesting
theoretical observation of a direct connection between the IMM state
estimation algorithm and jump-linear control.

It is desired to find a sequence of causal control values to minimize
the cost functional

$$J = E\{C\} = E\Big\{x(N)'Q(N)x(N) + \sum_{k=0}^{N-1} \big[x(k)'Q(k)x(k) + u(k)'R(k)u(k)\big]\Big\} \tag{2.1}$$

where $Q(k) \ge 0$ for each $k = 0, 1, \ldots, N$ and it is sufficient
that $R(k) > 0$ for each $k = 0, 1, \ldots, N-1$.

The discrete-time system state and measurement modeling equations are

$$x(k) = F[M(k)]\,x(k-1) + G[M(k)]\,u(k-1) + v[k-1, M(k)] \tag{2.2a}$$

$$z(k) = H[M(k)]\,x(k) + w[k, M(k)], \qquad k = 0, 1, 2, \ldots \tag{2.2b}$$

where $x(k)$ is an $n \times 1$ system state vector, $u(k)$ is a
$p \times 1$ control input, and $z(k)$ is an $m \times 1$ system state
observation vector. The argument $M(k)$ denotes the model "at time k",
i.e. in effect during the sampling period ending at $k$. The process
and measurement noise sequences, $v[k-1, M(k)]$ and $w[k, M(k)]$, are
white and mutually uncorrelated.

The model at time $k$ is assumed to be among a finite set of $r$ models

$$M(k) \in \{1, 2, \ldots, r\} \tag{2.3}$$

for example

$$F[M(k) = j] = F_j \tag{2.4}$$

vtk-l.M(k)-l)  - 

(2.5) 

w(K,M(k)»Jl  -  /FIKj.Wjl 

(2.G) 

i.e., the structure of the system and/or the statistics of the noises
might be different from model to model.

The model switching process to be considered here is of the Markov
type. The process is specified by a transition matrix with elements
$p_{ij}$. Let

$$I^k \triangleq \{z(0), z(1), \ldots, z(k), u(0), u(1), \ldots, u(k-1)\} \tag{2.7}$$

denote the information available to the controller at time $k$ (i.e.
the control is causal).
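
The problem setup of this section can be made concrete with a small
simulation. The following Python sketch is only an illustration under
assumptions that are not in the paper: a scalar state, zero-mean
Gaussian process noise, a uniformly drawn initial model, and an
open-loop control sequence supplied by the caller. The function name
`simulate_cost` and all parameter names are hypothetical. It draws the
model sequence from the Markov transition matrix, propagates the state
as in (2.2a), and accumulates the realized quadratic cost of (2.1).

```python
import random


def simulate_cost(F, G, P, v_var, Q, R, x0, controls, rng):
    """Simulate x(k+1) = F[j] x(k) + G[j] u(k) + v, with j = M(k+1), and
    return x(N)^2 Q[N] + sum over k < N of (x(k)^2 Q[k] + u(k)^2 R[k])."""
    r = len(F)
    mode = rng.randrange(r)              # initial model, uniform (assumption)
    x, cost = x0, 0.0
    for k, u in enumerate(controls):
        cost += Q[k] * x * x + R[k] * u * u
        # Draw the next model from row `mode` of the transition matrix.
        mode = rng.choices(range(r), weights=P[mode])[0]
        v = rng.gauss(0.0, v_var[mode] ** 0.5)
        x = F[mode] * x + G[mode] * u + v
    return cost + Q[len(controls)] * x * x


# Noise-free sanity check with two identical models: the state is driven
# from 1 to 0 by the first control, so only stage 0 contributes to the cost.
cost = simulate_cost(F=[1.0, 1.0], G=[1.0, 1.0],
                     P=[[0.5, 0.5], [0.5, 0.5]], v_var=[0.0, 0.0],
                     Q=[1.0, 1.0, 1.0], R=[1.0, 1.0],
                     x0=1.0, controls=[-1.0, 0.0], rng=random.Random(0))
```

Averaging the returned value over many independent runs gives a Monte
Carlo estimate of J for a given control sequence; a feedback controller
would compute each control from the measurements instead of taking it
from a fixed list.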

3. THE LAST STAGE CONTROL AND THE CONNECTION WITH THE IMM ESTIMATOR

An integral part of any control algorithm for this class of problems
is the system state estimator. In this section we show that there
exists an interesting connection between the control of multiple model
stochastic systems and the IMM system state estimator [B1]. To this
end we start by solving for the time $N-1$ optimal control. The optimal
control at time $N-1$ is the value of $u(N-1)$ which minimizes

$$\begin{aligned}
J(N-1) &= E\{x(N-1)'Q(N-1)x(N-1) + u(N-1)'R(N-1)u(N-1) + x(N)'Q(N)x(N) \mid I^{N-1}\} \\
&= \sum_{j=1}^{r} E\{x(N-1)'Q(N-1)x(N-1) + u(N-1)'R(N-1)u(N-1) + x(N)'Q(N)x(N) \mid I^{N-1}, M(N) = j\}\; P\{M(N) = j \mid I^{N-1}\}
\end{aligned} \tag{3.1}$$

Let

$$\mu_j(N|N-1) \triangleq P\{M(N) = j \mid I^{N-1}\} \tag{3.2}$$

and use the state equation (2.2a) and (2.4), (2.5) in (3.1) to get

$$\begin{aligned}
J(N-1) = \sum_{j=1}^{r} E\{ & x(N-1)'[Q(N-1) + F_j'Q(N)F_j]\,x(N-1) + 2u(N-1)'G_j'Q(N)F_j\,x(N-1) \\
& + u(N-1)'[R(N-1) + G_j'Q(N)G_j]\,u(N-1) \mid I^{N-1}, M(N) = j\}\; \mu_j(N|N-1) \\
& + \sum_{j=1}^{r} \mathrm{tr}(Q(N)V_j)\, \mu_j(N|N-1)
\end{aligned} \tag{3.3}$$

Now taking the partial of (3.3) w.r.t. $u(N-1)$ and setting it to zero
yields

$$u^*(N-1) = -\Big[R(N-1) + \sum_{j=1}^{r} G_j'Q(N)G_j\,\mu_j(N|N-1)\Big]^{-1} \sum_{j=1}^{r} G_j'Q(N)F_j\, E\{x(N-1) \mid I^{N-1}, M(N) = j\}\; \mu_j(N|N-1) \tag{3.4}$$

Notice that

$$E\{x(N-1) \mid I^{N-1}, M(N) = j\} = \sum_{i=1}^{r} E\{x(N-1) \mid I^{N-1}, M(N) = j, M(N-1) = i\}\; P\{M(N-1) = i \mid M(N) = j, I^{N-1}\} \tag{3.5}$$

where, since $M(N) = j$ in the first conditioning is irrelevant, the
expectation inside the summation is the model-conditioned estimate
$\hat x^i(N-1|N-1)$, so that

$$E\{x(N-1) \mid I^{N-1}, M(N) = j\} = \sum_{i=1}^{r} \hat x^i(N-1|N-1)\, \mu_{i|j}(N-1|N-1) \triangleq \hat x^{0j}(N-1|N-1) \tag{3.6}$$

which is the IMM mixed initial estimate [B1].

Thus using (3.6) in (3.4) we get

$$u^*(N-1) = -\Big[R(N-1) + \sum_{j=1}^{r} G_j'Q(N)G_j\,\mu_j(N|N-1)\Big]^{-1} \sum_{j=1}^{r} G_j'Q(N)F_j\, \hat x^{0j}(N-1|N-1)\, \mu_j(N|N-1) \tag{3.7}$$


4. THE CONTROL ALGORITHM

We will derive a full-tree control algorithm (FT) which computes
control values by taking into account all possible future model
histories. As will be seen by our example this method offers improved
performance over the existing scheme [T3].

The $l$-th future history of models is denoted as

$$M^{k,N,l} \triangleq \{M(k) = l_k, \ldots, M(N) = l_N\}, \qquad l = 1, \ldots, r^{N-k+1} \tag{4.1}$$

where $l_i$ is the model at time $i$ from history $l$ and

$$1 \le l_i \le r, \qquad i = k, \ldots, N \tag{4.2}$$

The stochastic dynamic programming recursion is

$$J^*(k, I^k) \triangleq \min_{u(k)} E\{x(k)'Q(k)x(k) + u(k)'R(k)u(k) + J^*(k+1, I^{k+1}) \mid I^k\} \tag{4.3}$$

where $J^*(k, I^k)$ is the optimal cost-to-go from time $k$ to the end.
Now applying the total probability theorem to (4.3) yields

$$J^*(k, I^k) = \min_{u(k)} \sum_{l} E\{x(k)'Q(k)x(k) + u(k)'R(k)u(k) + J^*(k+1, I^{k+1}) \mid M^{k+1,N,l}, I^k\}\; P\{M^{k+1,N,l} \mid I^k\} \tag{4.4}$$

where the sum is over all future model histories $l$. The control that
minimizes an approximation to (4.4) is derived in the Appendix, and is
given as

$$u^*(k) = -\Big[R(k) + \sum_{l} G_{l_{k+1}}' P_l(k+1)\, G_{l_{k+1}}\, \mu_l(N|k+1)\Big]^{-1} \sum_{l} G_{l_{k+1}}' P_l(k+1)\, F_{l_{k+1}}\, \hat x^{0 l_{k+1}}(k|k)\, \mu_l(N|k+1) \tag{4.5}$$

and again we see the natural way the IMM mixed initial estimates show
up.

Note that the control parameters $P_l(k)$ (model-history-conditioned
optimal cost matrices) are computable off-line.
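
The last-stage control of section 3, with the IMM mixed initial
estimate of (3.6) entering (3.7), can be sketched for a scalar system.
The Python code below is a hedged illustration, not the authors'
implementation: the function names are hypothetical, all quantities are
scalars, and the mixing probabilities are computed with the standard
IMM Bayes rule (transition probability times current model probability,
normalized by the predicted model probability), which is not spelled
out in this text.

```python
def mixing_probabilities(mu, P):
    """Return (mix, pred) where pred[j] is the predicted probability of
    model j at the next step and mix[i][j] is the probability that model i
    was in effect now, given that model j is in effect next."""
    r = len(mu)
    pred = [sum(P[i][j] * mu[i] for i in range(r)) for j in range(r)]
    # Assumes every pred[j] is strictly positive.
    mix = [[P[i][j] * mu[i] / pred[j] for j in range(r)] for i in range(r)]
    return mix, pred


def last_stage_control(x_hat, mu, P, F, G, Q_N, R):
    """Scalar version of the last-stage control law (3.7)."""
    r = len(mu)
    mix, pred = mixing_probabilities(mu, P)
    # Mixed initial estimates, one per next-step model, as in (3.6).
    x0 = [sum(x_hat[i] * mix[i][j] for i in range(r)) for j in range(r)]
    denom = R + sum(G[j] * Q_N * G[j] * pred[j] for j in range(r))
    numer = sum(G[j] * Q_N * F[j] * x0[j] * pred[j] for j in range(r))
    return -numer / denom


# Two identical models with identical estimates reduce to the ordinary
# one-step linear quadratic control: u = -G Q F x / (R + G Q G).
u = last_stage_control(x_hat=[2.0, 2.0], mu=[0.5, 0.5],
                       P=[[0.9, 0.1], [0.1, 0.9]],
                       F=[1.0, 1.0], G=[1.0, 1.0], Q_N=1.0, R=1.0)
```

When the models differ in G, as in a gain failure, the control is a
probability-weighted compromise between the per-model gains rather than
the gain of the most likely model.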


5. SIMULATION RESULTS

The FT controller developed in Sec. 4 is used to control the state
trajectory of the system. The performance of this algorithm, as
determined by (2.1), is compared to the cost obtainable by using the
MMP controller discussed in [T3]. In order to obtain a meaningful
comparison we use the rigorous statistical analysis technique
presented in [B5, W3].

The control of a double integrator system with process and measurement
noises is considered with a gain failure. The two possible models,
$j = 1, 2$, are given by the following system equation

[system equation for $x^j(k+1)$, driven by $u(k)$ and $v(k)$; not legible in the source]

with measurement equation

$$z(k) = [1 \ \ 0]\, x^j(k) + w(k)$$

The models differ in the control gain parameter $b^j$. The process and
measurement noises are mutually uncorrelated with zero mean and
variances given by

$$E[v(k)\, v(j)] = 0.16\, \delta_{kj}$$

$$E[w(k)\, w(j)] = \delta_{kj}$$

The control gain parameters were chosen to be $b^1 = 2$ and
$b^2 = 0.5$.

The Markov transition matrix was selected to be

[2 x 2 matrix; only the second row, 0.1 0.9, is legible in the source]

For this example $N = 7$, and the cost parameters $R(k)$ and $Q(k)$
(see (2.1)) were selected as

$$R(k) = 5.0, \qquad k = 1, 2, \ldots, N-1$$

[2 x 2 matrices $Q(k)$; entries not legible in the source]

where the last matrix, $Q(7)$, reflects our desire to drive $x_1(7)$
vigorously to zero. Also note that for this example $T = 1.0$.

The real system was initialized with $x(0) = (30.0,\ 0.0)'$ and a
random selection was done for choosing the initial model with
$P\{M(0) = i\} = 0.5$, $i = 1, 2$. The Kalman filters each received an
initial state covariance of

[2 x 2 matrix; not legible in the source]

and the initial state estimate was selected as

$$\begin{bmatrix} \hat x_1(0|0) \\ \hat x_2(0|0) \end{bmatrix} = \begin{bmatrix} z(0) \\ z(0) - z(-1) \end{bmatrix}$$

where $z(-1) = 30.0 + w(-1)$ and $z(0) = 30.0 + w(0)$.

Statistical tests were made on the results of 50 Monte Carlo runs.
Sample means and variances of the Monte Carlo costs $C$, defined in
(2.1), were computed for the FT, MMP, and "known model-history" (i.e.
optimum linear quadratic) controllers.

Table I contains the results. The FT algorithm shows a clear reduction
in cost as compared with the MMP scheme. However, in order to provide
a rigorous argument that the actual performance is ordered as Table I
indicates, we apply the statistical test presented in [B5, W3].


TABLE I

SAMPLE AVERAGE COSTS AND STANDARD DEVIATIONS

                             Model-History   FT       MMP
Sample Mean                  2,697           6,063    19,519
Sample Standard Deviation    8,096           3.96E5   1.12E7


Table II contains the results. The sample mean $\bar\Delta$ of the
cost differences, $C^{MMP} - C^{FT}$, and the sample standard deviation
$\sigma_{\bar\Delta}$ of that mean are shown. The hypothesis that the
FT controller is better than the MMP scheme can be accepted only if
the probability of error $\alpha$ is less than, say, 1 percent. Then
the threshold against which we compare the test statistic
$\bar\Delta / \sigma_{\bar\Delta}$ is 2.33. This test statistic has to
exceed the threshold in order to accept the hypothesis.


TABLE II

STATISTICAL TEST FOR ALGORITHM COMPARISONS

          Mean difference   Std. dev. of mean   Test statistic   Estimated improvement (%)
FT-MMP    13,456            3,316               4.1              70


Since the test statistic in Table II exceeds this threshold, the FT
controller performs better than the MMP controller for this problem.
The estimated improvement (decrease in cost) of 70% is statistically
significant.
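
The comparison of two controllers by the mean of their Monte Carlo cost
differences can be sketched as follows. This is a minimal Python
illustration, assuming that the costs of the two controllers are paired
run by run and that the test statistic is compared with the one-sided
Gaussian threshold 2.33 quoted in this section; the function name
`paired_test` and the sample numbers are hypothetical and are not the
data of the paper.

```python
import math


def paired_test(cost_a, cost_b, threshold=2.33):
    """Test whether controller b has a lower mean cost than controller a.

    Returns the mean of the paired differences, the standard deviation of
    that mean, the test statistic, and the accept/reject decision."""
    n = len(cost_a)
    diffs = [a - b for a, b in zip(cost_a, cost_b)]
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)   # sample variance
    sigma_mean = math.sqrt(var / n)                        # std. dev. of the mean
    stat = mean / sigma_mean
    return mean, sigma_mean, stat, stat > threshold


# Made-up costs from four paired runs (illustrative only).
mean, sigma_mean, stat, accept = paired_test([10.0, 12.0, 11.0, 13.0],
                                             [5.0, 6.0, 7.0, 6.0])
```

Pairing the runs (same noise and model sequences for both controllers)
removes the variability that is common to both costs, which is why the
standard deviation of the mean difference can be much smaller than the
spread of the individual costs.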

6. CONCLUSION

The development of a new control algorithm for discrete-time hybrid
stochastic systems with Markovian jump parameters has been presented.
This controller was derived through the use of stochastic dynamic
programming and by taking into account all possible future "histories
of models". This scheme uses the IMM state estimation algorithm. We
show that there is an interesting connection between the IMM state
estimator and control of jump-linear hybrid systems. This new
controller is of the OLOF class and has off-line computable control
gain parameters.

From the example it is seen that this scheme can achieve a
statistically significant reduction in cost when compared to the
multiple model partitioning approach.


APPENDIX

I. Derivation of (4.5)

Note that given the future history of models $M^{k+1,N,l}$, the optimal
cost-to-go $J^*(k+1, I^{k+1})$ is easily computed and is denoted

$$J_l^*(k+1, I^{k+1}) \triangleq E\{x(k+1)'P_l(k+1)\,x(k+1) \mid I^{k+1}, M^{k+1,N,l}\} + \alpha_l(k+1) \tag{A.1}$$

where the notation from [B4] is used for $P(k+1)$ and $\alpha(k+1)$.

Since the expectation in (4.4) is conditioned on $M^{k+1,N,l}$, we
obtain our approximation by replacing $J^*(k+1, I^{k+1})$ inside the
expectation with (A.1), and (4.4) becomes

$$\begin{aligned}
J^*(k, I^k) \approx \min_{u(k)} \sum_{l} E\{ & x(k)'Q(k)x(k) + u(k)'R(k)u(k) \\
& + E\{x(k+1)'P_l(k+1)\,x(k+1) \mid I^{k+1}, M^{k+1,N,l}\} \\
& + \alpha_l(k+1) \mid M^{k+1,N,l}, I^k\}\; \mu_l(N|k+1)
\end{aligned} \tag{A.2}$$

where

$$\mu_l(N|k+1) \triangleq P\{M^{k+1,N,l} \mid I^k\} \tag{A.3}$$

Now use (2.2a) and apply the smoothing property of expectation to
(A.2) to get

$$\begin{aligned}
J^*(k, I^k) \approx \min_{u(k)} \sum_{l} E\{ & x(k)'Q(k)x(k) + u(k)'R(k)u(k) \\
& + [F_{l_{k+1}} x(k) + G_{l_{k+1}} u(k) + v(k, l_{k+1})]'\, P_l(k+1)\, [\,\cdot\,] \\
& + \alpha_l(k+1) \mid M^{k+1,N,l}, I^k\}\; \mu_l(N|k+1)
\end{aligned} \tag{A.4}$$

Take the partial w.r.t. $u(k)$ of (A.4) and set to zero to solve for

$$u^*(k) = -\Big[R(k) + \sum_{l} G_{l_{k+1}}' P_l(k+1)\, G_{l_{k+1}}\, \mu_l(N|k+1)\Big]^{-1} \sum_{l} G_{l_{k+1}}' P_l(k+1)\, F_{l_{k+1}}\, E\{x(k) \mid M^{k+1,N,l}, I^k\}\; \mu_l(N|k+1) \tag{A.5}$$

We still need to evaluate the expectation in (A.5). This is done as
follows. Note that $x(k)$ is independent of $M(i)$,
$i = k+2, \ldots, N$, if $M(k+1)$ is known, thus

$$E\{x(k) \mid M^{k+1,N,l}, I^k\} = E\{x(k) \mid M(k+1) = l_{k+1}, I^k\} \tag{A.6}$$

But (A.6) is $\hat x^{0 l_{k+1}}(k|k)$, the IMM mixed initial estimate
(see (3.6)), thus using (A.6) in (A.5), we get (4.5).
ABSTRACT

Piecewise Deterministic (PD) Markov processes form a remarkable class
of hybrid state processes because, in contrast to most other hybrid
state processes, they include a jump reflecting boundary and exclude
diffusion. As such, they cover a wide variety of impulsively or
singularly controlled non-diffusion processes. Because PD processes
are defined in a pathwise way, they provide a framework to study the
control of non-diffusion processes along the same lines as that of
diffusions. An important generalization is to include diffusion in PD
processes, but, as pointed out by Davis, combining diffusion with a
jump reflecting boundary seems not possible within the present
definition of PD processes. This paper presents PD processes as
pathwise unique solutions of an Itô stochastic differential equation
(SDE), driven by a Poisson random measure. Since such an SDE permits
the inclusion of diffusion, this approach leads to a large variety of
piecewise diffusion Markov processes, represented by pathwise unique
SDE solutions.

1a _ INTRODUCTION 

Because  many  of  the  stochastic  processes  that  we 
meet  in  nature  have  a  state  space  that  is  a  product 
of  a  continuous  space  and  a  discrete  set,  we  often 
need  pathwise  models  on  such  a  hybrid  state  space. 

As  a  result,  several  classes  of  hybrid  state  space 
models  have  been  developed,  such  as  systems  with 
Markovian  switching  coefficients,  doubly  stochastic 
counting  processes  and  Markov  decision  drift 
processes. These models are used in quite different
fields of application, so their studies have
often evolved separately. One reason to study hybrid
state space processes within a common framework is
that their martingale parts are in general
discontinuous. This property has attracted a lot of
attention,  and  is  by  now  very  well  documented 
(Jacod, 1979; Cinlar et al., 1980; Brémaud, 1981;
Elliott,  1982;  Bensoussan  and  Lions,  1984;  Ethier 
and  Kurtz,  1986;  Jacod  and  Shiryaev,  1987).  It  is 
quite  clear  from  these  results  that,  to  study  hybrid 
state  Markov  processes  along  the  same  lines  as 
diffusions,  we  need  both  pathwise  representations 
and  strong  Markov  (martingale)  characterizations  of 
those  processes.  Unfortunately,  for  hybrid  state 
Markov  processes  there  is  presently  a  lacuna  of 
pathwise  representations  with  strong  Markov 
characterizations.  This  lacuna  is  apparent  if  we 
depict  the  main  classes  of  hybrid  state  Markov 
processes in the form of a Venn diagram (fig. 1).


There  exist  pathwise  representations  with  strong 
Markov  characterizations  of  counting  processes  with 
diffusion  intensity  (Snyder,  1975;  Marcus,  1978),  of 
diffusions  with  Markovian  switching  coefficients 
(Wonham,  1970;  Brockett  and  Blankenship,  1977)  and 
of Piecewise Deterministic (PD) Markov processes
(Davis, 1984). For many other Markov processes in
figure 1, there exist only strong Markov
characterizations (Kingman, 1975; Anulova, 1979,
1982; Bensoussan and Lions, 1984; Belbas and
Lenhart, 1986). Actually, PD Markov processes seem
the  most  interesting  of  all  processes  in  figure  1, 
as they provide pathwise representations with a
strong Markov characterization of all major
non-diffusion Markov processes. As such, PD Markov
processes provide a framework to study Markov
decision drift processes (Hordijk and Van der Duyn
Schouten, 1983; Yushkevich, 1983) along the same
lines as diffusions (Vermes, 1985). With this, an
interesting  generalization  is  to  extend  the  spectrum 
of  hybrid  state  Markov  processes  by  including 
diffusion  into  PD  Markov  processes.  As  the  present 
definition  of  PD  processes  does  not  seem  to  have  an 
opening  left  for  that  inclusion  (Davis,  1984),  we 
need  a  different  approach. 


Fig. 1. Main classes of hybrid state Markov processes.

The  approach  that  overcomes  this  difficulty, 
presented  in  the  sequel,  is  to  assume  a  stochastic 
differential  equation  (SDE)  in  a  hybrid  space  and  to 
construct  a  large  class  of  piecewise  diffusion 
Markov processes from it. With respect to the state
space we restrict our attention to a hybrid subset
of a Euclidean space. Then the most general SDE is
of Itô type, driven by Brownian motion, $w$, and a
Poisson random measure, $p$ on $(0,\infty)\times R^d$,

$d\xi_t = \alpha(\xi_t)\,dt + \beta(\xi_t)\,dw_t + \int_{R^d} \psi(\xi_{t-},u)\, p(dt,du)$.

The path of a solution of this SDE is right
continuous and has left hand limits: $\xi_{t-} = \lim_{\Delta\downarrow 0} \xi_{t-\Delta}$.

If $p$ generates a multivariate point $(t,u_t)$, then the
path of $\xi$ has a discontinuity:

$\xi_t = \xi_{t-} + \psi(\xi_{t-},u_t)$.
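The following is a minimal simulation sketch of an SDE of this kind in one dimension, added for illustration only. It assumes a finite jump rate, scalar coefficients and a uniform mark on (0, 1); the function name `simulate_jump_sde` and all of its parameters are hypothetical and do not come from the paper. A simple Euler step is used for the continuous part, and a jump is applied to the value just before the jump.

```python
import math
import random


def simulate_jump_sde(xi0, alpha, beta, psi, rate, horizon, dt=1e-3, seed=0):
    """Euler sketch of d(xi) = alpha(xi) dt + beta(xi) dw + jump term.

    Assumptions (not from the paper): scalar state, finite jump rate,
    marks u drawn uniformly from (0, 1).
    """
    rng = random.Random(seed)
    xi, t = xi0, 0.0
    path = [(t, xi)]
    n_steps = int(round(horizon / dt))
    for _ in range(n_steps):
        # continuous part: drift plus Brownian increment
        dw = rng.gauss(0.0, math.sqrt(dt))
        xi_left = xi + alpha(xi) * dt + beta(xi) * dw
        xi = xi_left
        # a point of the Poisson measure falls in this step
        # with probability of about rate * dt
        if rng.random() < rate * dt:
            u = rng.random()
            # discontinuity: new value = left limit + psi(left limit, mark)
            xi = xi_left + psi(xi_left, u)
        t += dt
        path.append((t, xi))
    return path
```

With `rate=0.0` and `beta` identically zero the sketch reduces to a plain Euler integration of the drift, which is a convenient sanity check.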

In the sequel we shall focus on pathwise unique
solutions. The classical result for the existence of
such solutions requires that $\psi$ is sufficiently
continuous (Gihman and Skorohod, 1972), which
restricts the SDE essentially to systems with
Markovian switching coefficients. However, there are
some non-classical pathwise uniqueness results that
allow a discontinuous $\psi$ (Lepeltier and Marchal,
1976; Jacod and Protter, 1982; Veretennikov, 1988).
Taking these results as a starting point, we
introduce and evaluate a particular structure for $\psi$
in section 2. This structure poses hardly any
restrictions on the possible solution of the SDE,
while it enables a separate evaluation of an
unbounded jump intensity and a hybrid state space
situation. In view of this separation, we first
consider, in sections 3 and 4, the modelling of a
jump reflecting boundary in $R^n$ through an unbounded
jump intensity, and after that, in section 5, we
consider the hybrid state situation.

Assume an open subset $O$ of $R^n$ with jump reflecting
boundary $\partial O$, which means that $\{\xi_t\}$ undergoes an
instantaneous jump into the interior of $O$ if $\{\xi_t\}$
tries to cross or to travel through $\partial O$. To model
this with the above SDE, the Poisson random measure
$p$ should instantaneously generate a point when $\{\xi_t\}$
enters $\partial O$. However, this is not possible as a
Poisson random measure generates almost surely no
point at an entrance time. To overcome this problem,
we briefly discuss the following three approaches:

1.  Replace  p  by  a  random  measure,  with  almost 
surely  one  point  at  an  arbitrary  time. 

2. Assume a $\psi$ such that $p$ generates an active point
during an infinitesimally small time interval
after entering $\partial O$.

3. Assume a $\psi$ such that $p$ generates an active point
during an infinitesimally small time interval just
before entering $\partial O$.

Approach 1 adequately solves the instantaneous jump
problem but creates many new problems, because if $p$
is not a Poisson random measure, then the SDE cannot
be analysed within the powerful Itô framework.
Approach 2 is the well known approach of randomized
stopping (Bensoussan and Lions, 1984). As this
approach allows $\{\xi_t\}$ to cross or to travel through
$\partial O$, the resulting process is at best a modification
of a PD Markov process. Approach 3 is the desired
solution. However, the problem with approach 3 is
that it is in general not known how to carry it out.
A constructive answer to this will be given in the
sequel. It is clear that approach 3 needs a kind of
prediction of the time that $\{\xi_t\}$ might, otherwise,
enter $\partial O$. Actually, PD Markov processes are
presently the only processes for which this
prediction problem is solved (Davis, 1984). As such,
we first formulate that solution in an SDE set up in
section 3. Next, in section 4, we present a solution
of the prediction problem for the situation with
diffusion.


Finally, in section 5, we explicitly consider the
hybrid state space situation. The most interesting
effect of the hybrid state space assumption is that
it leads to a particular type of jump: jumps in the
continuous state component of $\{\xi_t\}$ that anticipate a
simultaneous transition of the discrete component of
$\{\xi_t\}$. This type of jump has been introduced by
Gnedenko and Kovalenko (1968) for piecewise linear
processes and by Sworder (1972) for systems with
Markovian switching coefficients. For short we refer
to these anticipating simultaneous jumps as hybrid
jumps. The SDE framework of this paper provides an
elegant way of representing the hybrid jumps of PD
Markov processes and their piecewise diffusion
generalizations.


Some other interesting generalizations of PD Markov
processes, not considered in the sequel, are the
inclusion of continuously reflecting or sticky
boundaries. The inclusion of a continuously
reflecting boundary, while preserving pathwise
uniqueness, seems possible if that boundary is
smooth enough (Chaleyat-Maurel et al., 1980; Menaldi
and Robin, 1985; Frankowska, 1985; Saisho, 1987).

The inclusion of a sticky boundary without losing
pathwise uniqueness seems difficult if not
impossible, but strong Markov characterizations are
possible (Kingman, 1975; Anulova, 1979, 1982).
$\xi^i_t$ : i-th component of process $\xi_t$.

$\partial O$ : boundary of the closure of set $O$.

Int(x) : integer part of x.

$\chi$ : $\chi$(True)=1 and $\chi$(False)=0.

CADLAG : right continuous with left hand limits.

$C^k(O)$ : the set of all real-valued functions that
are k times continuously differentiable on
$O$. The superscript is deleted if k=0. If k
is followed by b, then f and its first k
derivatives are bounded on $O$.

$D(\mathcal{A})$ : domain of operator $\mathcal{A}$.


2 THE SDE OF LEPELTIER AND MARCHAL

We assume a stochastic basis $(\Omega,\mathcal{F},\mathbb{F},P)$, endowed with
an m-dimensional standard Wiener process, $\{w_t\}$, and
a Poisson random measure, $p(dt,du)$ on $R_+\times R^{d+1}$ (Jacod
and Shiryaev, 1987, p. 70), with intensity measure
$dt\times m(du)$, and consider the following stochastic
differential equation (SDE) on $R_+\times R^n$,

$d\xi_t = \alpha(\xi_t)\,dt + \beta(\xi_t)\,dw_t + \int_{R_-\times R^d} \psi(\xi_{t-},u)\, q(dt,du) + \int_{R_+\times R^d} \psi(\xi_{t-},u)\, p(dt,du)$, (1)

where $q$ is the martingale measure of $p$, $\xi_0$ is an
$\mathcal{F}_0$-measurable random variable, while $\alpha$, $\beta$ and $\psi$ are
measurable mappings of appropriate dimensions.

The classical reference for equation (1) is Gihman
and Skorohod (1972). Significant extensions of their
results have been obtained by Lepeltier and Marchal
(1976) in their study of the relation between an
integro-differential operator and an SDE of type
(1). Their particular SDE can easily be obtained
from (1), by introducing homeomorphism mappings of
$R_-\times R^d$ into $\{u\in R^{d+1};\ 0<|u|\le 1\}$ and of $R_+\times R^d$ into
$\{u\in R^{d+1};\ 1<|u|<\infty\}$, and subsequently transforming $m$
and $\psi$ correspondingly. Consequently, the results of
Lepeltier and Marchal can immediately be used in the
present study of (1), while allowing the intensity
of the active points in $R_+$ to be unbounded outside
some known Borel set $O'\subset R^n$.

Assumptions

A.1 There is a constant $K$ such that, for all $\xi\in R^n$,
$|\alpha(\xi)|^2 + |\beta(\xi)|^2 + \int_{R_-\times R^d} |\psi(\xi,u)|^2\, m(du) \le K(1+|\xi|^2)$.


A'.3 $O'$ is a known Borel subset of $R^n$,

$\int_{R_+\times R^d} \chi[\psi(\xi,u)\neq 0]\, m(du)$ is uniformly bounded on $O'$,

and $[\xi+\psi(\xi,u)] \in O'$, for all $\xi\in R^n$, $u\in R^{d+1}$.

A'.4 For all $k\in N$ there exists a constant $M_k$, such
that, with $B_k$ the ball of A.2:

&.  for  all  EeBkno', 

R+xRd  ra(du>  *  Mk* 

fe.  for  all  EeBkn(Rn-0'), 

J  £d  l'H«,u>l  m(du)  £  Mk, 

given that, for all $u\in R_+\times R^d$,

$\psi(\xi,u) = \psi(\xi,u+\mathrm{Col}(1,0,..,0))$.

A'  ■  5  For  all  r6N  there  is  a  constant  Nr,  such  that 
E(  l  R+/Rd  X(  *(Es_,u)*0  )  p(ds,du)>  £  Nr . 


2.1 Proposition

Given $m(du) = du_1\times\mu(d\underline{u})$ and assumptions A.1, A.2,
A'.3, A'.4, A'.5 are satisfied. Then equation (1)
has for any $\xi_0\in O'$ a pathwise unique solution, $\{\xi_t\}$.
Moreover $\{\xi_t\}$ is then a right continuous Markov
process.

Remark: Proposition 2.1 is a version of Theorem III4
of Lepeltier and Marchal (1976), in the sense that
they considered the situation of $O' = R^n$.
Nevertheless, for the proof we can almost follow
Lepeltier and Marchal. Another recent extension of
Theorem III4 of Lepeltier and Marchal is to the
situation of a non-Lipschitzian $\alpha$ in return for a
sufficient non-degeneracy assumption on $\beta$
(Veretennikov, 1988).


Proof: 

If (1)'s fourth right hand term vanishes, then it is
well known that A.1 and A.2 are sufficient
conditions (Gihman and Skorohod, 1972). As such, we
have to show that (1)'s fourth right hand term does
not change that situation, under A'.3, A'.4 and
A'.5.

Due to A'.3 and the definition of Itô integration a
solution of (1) is CADLAG. Due to A'.5, the
discontinuities in $\{\xi_t\}$ that are caused by (1)'s
fourth right hand term are countable. Therefore we
can associate with each discontinuity a time, $T_i$,
and a multivariate point, $u_{T_i}$, such that
$0<T_1<T_2<..<T_i<..$ and $\lim_{i\to\infty} T_i = \infty$. Due to the latter
and $\{\xi_t\}$ being CADLAG,


$\int_0^t\int_{R_+\times R^d} \psi(\xi_{s-},u)\, p(ds,du) = \sum_{0<T_i\le t} \psi(\xi_{T_i-},u_{T_i})$.

If (1)'s first three right hand terms vanish, then
the latter sum is finite (a.s.) for all $t\in R_+$, due to
A'.4 and A'.5. With this result it is sufficient to
show that (1) has a pathwise unique solution on an
arbitrary finite time-interval [0,T]. For the
existence of a solution, see the proof of Th. III4
of Lepeltier and Marchal (1976; pp. 82-85). We
already know that a solution is unique and
$\mathcal{F}_t$-measurable on $[0,T_1)$. Because $\{\xi_t\}$ is CADLAG and $\psi$
is measurable, $\xi_{T_1-}$ is $\mathcal{F}_{T_1-}$-measurable. Then, by the
definition of a Poisson random measure (Jacod and
Shiryaev, 1987, pp. 65-66), $u_{T_1}$ is $\mathcal{F}_{T_1}$-measurable, so
$\psi(\xi_{T_1-},u_{T_1})$ is $\mathcal{F}_{T_1}$-measurable and, due to A'.3,
$\xi_{T_1}\in O'$. Hence pathwise uniqueness holds true on
$[0,T_1]$ and $\xi_{T_1}\in O'$. Due to the latter, we can repeat
the procedure to show that pathwise uniqueness holds
true on $[T_1,T_2]$ and $\xi_{T_2}\in O'$, and so on for the
countable sequence of intervals. Q.E.D.


The interesting aspect of proposition 2.1 is that
the coefficients of (1)'s fourth right hand term may
be discontinuous in $\xi$. This is exactly what we need
to construct a class of hybrid state Markov
processes that is larger than the class of solutions
of systems with Markovian switching coefficients.

The first step towards this construction is
replacing $\psi(\xi,u)$ by

$\psi'(\xi,u) = \psi(\xi,u)\,\chi[\,(u_1<\Lambda(\xi)) \cup (F(\xi)\neq 0)\,]$, (2.a)

where $F$ is a measurable mapping of $R^n$ into {0,1}, $\psi$
and $\Lambda$ are measurable mappings of appropriate
dimensions, while the range of $\Lambda$ is $R_+$. With this
(1) becomes

$d\xi_t = \alpha(\xi_t)\,dt + \beta(\xi_t)\,dw_t + \int_{R_-\times R^d} \psi'(\xi_{t-},u)\, q(dt,du) + \int_{R_+\times R^d} \psi'(\xi_{t-},u)\, p(dt,du)$. (2.b)


Assumptions 

A.3 Define $O' \triangleq \{\xi\in R^n;\ F(\xi)=0\}$,

$[\xi+\psi(\xi,u)] \in O'$, for all $\xi\in R^n$, $u\in R^{d+1}$.

A''.4 Given, for all $\xi\in R^n-O'$ and $u\in R_+\times R^d$,

$\Lambda(\xi)=1$,

$\psi(\xi,u)=\psi(\xi,u+\mathrm{Col}(1,0,..,0))$,
and  for  any  k6N  there  exists  a  constant  Mk, 
such  that 

I  »(du)  <:  Mk,  for  all  EGBk. 


A''.5 a. $\Lambda(\xi)$ is on $O'$ uniformly bounded and continuous
in $\xi$.

b. $\{\xi_t\}$, $t\in R_+$, exits $O'$ at most a countable
number of times.

2.2 Theorem

Given $m(du) = du_1\times\mu(d\underline{u})$ and assumptions A.1, A.2, A.3,
A''.4, A''.5 are satisfied. Then equation (2.a,b) has
for any $\xi_0\in O'$ a pathwise unique solution $\{\xi_t\}$.
Moreover $\{\xi_t\}$ is then a Markov process, of which the
sample paths are measurable on the stochastic basis
$(\Omega,\mathcal{F},\mathbb{F},P)$.


Proof: 

Because, on $O'$, $\Lambda(\xi)$ is continuous in $\xi$ (due to
A''.5.a) and $\chi[u_1<\Lambda']$, $\Lambda'\in R$, defines a measurable
mapping of $R^2$ into {0,1}, $\chi[u_1<\Lambda(\xi)]$ defines a
measurable mapping of $R\times O'$ into {0,1}. Because the
range of $F$ is {0,1}, we can write

$\chi[\,(u_1<\Lambda(\xi)) \cup (F(\xi)\neq 0)\,] = \chi[u_1<\Lambda(\xi)] \vee F(\xi)$,

of which both right hand terms are measurable. This
implies that the supremum is measurable, which,
combined with the measurability of $\psi$, makes $\psi'$
measurable. This ensures that (2.b) is a special
case of (1), with $\psi$ replaced by $\psi'$ according to
(2.a). With this we are left to verify that A.3,
A''.4 and A''.5 guarantee that A'.3, A'.4 and A'.5 are
satisfied, which is straightforward. Q.E.D.
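The indicator identity used in this proof (the indicator of a union equals the supremum of the two indicators when F takes values in {0,1}) can be checked exhaustively on a small grid. The sketch below is illustrative only; the helper names `chi`, `chi_union` and `chi_sup` are hypothetical, and `lam` stands for the value of the intensity mapping at a fixed state.

```python
def chi(condition):
    """Indicator: 1 if the condition is true, 0 otherwise."""
    return 1 if condition else 0


def chi_union(u1, lam, F):
    """Indicator of the union [u1 < lam] or [F != 0]."""
    return chi((u1 < lam) or (F != 0))


def chi_sup(u1, lam, F):
    """Supremum (maximum) of the indicator of [u1 < lam] and F, F in {0, 1}."""
    return max(chi(u1 < lam), F)
```

Both functions agree for every combination of `u1`, `lam` and `F` in {0, 1}, which is all the proof needs.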


Having  theorem  2.2,  we  are  prepared  to  consider  a 
jump  reflecting  boundary  (in  sections  3  and  4)  and 
the  hybrid  state  space  situation  (in  section  5).  But 
first  we  give  a  strong  Markov  characterization  of 
(Et)  if  there  is  no  reflecting  boundary. 

2.3 Proposition

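Given $F$ vanishes everywhere and the assumptions of
theorem 2.2 are satisfied. Then for all $\xi_0\in R^n$, $\{\xi_t\}$
is a semimartingale strong Markov process, and its
extended generator, $\mathcal{A}$, is given by:

$\mathcal{A}f = \mathcal{L}f + \mathcal{J}^-f + \mathcal{J}^+f$, for all $f\in C^{2,b}(R^n)$, (3)

where

$\mathcal{L}f(\xi) = \sum_{i=1}^n \alpha_i(\xi) f_{\xi_i}(\xi) + \frac{1}{2}\sum_{i,j=1}^n [\beta(\xi)\beta(\xi)^T]_{ij} f_{\xi_i\xi_j}(\xi)$, (4)

$\mathcal{J}^-f(\xi) = \int_{R^n-\{0\}} [\,f(\xi+\zeta) - f(\xi) - \sum_{i=1}^n \zeta_i f_{\xi_i}(\xi)\,]\, S^-(\xi,d\zeta)$, (5)

$\mathcal{J}^+f(\xi) = \Lambda(\xi)\int_{R^n-\{0\}} [\,f(\xi+\zeta) - f(\xi)\,]\, S^+(\xi,d\zeta)$, (6)

and for all Borel $A\subset R^n-\{0\}$,

$S^-(\xi,A) \triangleq \int_{R_-\times R^d} \chi[\,\psi(\xi,u)\in A\,]\, m(du)$, (7)

$S^+(\xi,A) \triangleq \int_0^1\int_{R^d} \chi[\,\psi(\xi,u)\in A\,]\, du_1\times\mu(d\underline{u})$. (8)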

Proof: 

Due  to  A. 3.  A11 . 4 .  A". 5  and  0'«Rn,  the  Sf^-predictable 
part  of  Et  is 

At  “  £  “(Es)ds  +  jji  A(^s“)^d  *(Es-.«)  m(du)ds. 

Obviously, $\{A_t\}$ is of finite variation on any finite
time-interval, while $\{\xi_t - A_t\}$ is a local martingale, so
$\{\xi_t\}$ is a (special) semimartingale (Jacod and
Shiryaev, p. 43, Def. 4.21). This immediately implies
that $\{\xi_t\}$ is a strong Markov process. Because $\{\xi_t\}$
is a semimartingale, the generator $\mathcal{A}$ follows from
Itô's differentiation rule for discontinuous
semimartingales (Elliott, 1982). Q.E.D.

3 PIECEWISE DETERMINISTIC MARKOV PROCESSES

In this section, we represent PD Markov processes as
solutions of an SDE. Therefore, we consider (2.a,b)
with $\beta = 0$ and $\psi$ vanishing on $R_-\times R^d$:

$d\xi_t = \alpha(\xi_t)\,dt + \int_{R_+\times R^d} \psi(\xi_{t-},u)\,\chi[\,(u_1<\Lambda(\xi_{t-})) \cup (F(\xi_{t-})\neq 0)\,]\, p(dt,du)$. (9)

Our goal is to introduce a particular mapping
$F: R^n \to \{0,1\}$, such that (9) has pathwise unique
solutions which are PD Markov processes. The present
definition of a PD Markov process (Davis, 1984)
works without such a mapping $F$. Instead, there is
given an open subset $O$ of $R^n$, with a jump reflecting
boundary $\partial O$, such that $\{\xi_t\}$ instantaneously jumps
into the interior of $O$ just before it would,
otherwise, cross or travel through $\partial O$. For the
definition of a PD Markov process from (9) an
appropriate $F$ has to be constructed from $O$ and $\alpha$.

The construction of $F$ will be based on the following
differential equation, on $(0,\infty)\times R^n$,

$d\xi'_t = \alpha(\xi'_t)\,dt$, $t\in(0,\infty)$, (10)

which has pathwise unique solutions, assuming that $\alpha$
satisfies conditions A.1 and A.2. From this, we
define $\underline{\partial O}$ as the set containing all elements of $\partial O$
that are directly accessible by $\{\xi'_t\}$ from $O$:

$\underline{\partial O} \triangleq \{\xi\in\partial O \mid \exists\, \tau\in(0,\infty)$ and $\xi'_0\in O$ such that $\xi'_\tau=\xi \wedge \xi'_{\tau-}\in O\}$. (11)

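Next we introduce the following distance function,

$d_\alpha(\xi,\underline{\partial O}) \triangleq \inf\{\tau\ge 0;\ \xi'_0=\xi \wedge \xi'_\tau\in\underline{\partial O}\}$, (12)

which is, under the above mentioned conditions on $\alpha$,
a measurable mapping of $R^n$ into $R$. With this we
define, for $i\in N$,

$O_i \triangleq \{\xi\in O;\ d_\alpha(\xi,\underline{\partial O}) \ge 1/i\}$, (13)

which are then Borel sets, and which form the Borel
set

$O' \triangleq \bigcup_{i\in N} O_i$. (14)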

Now we define our particular $F$ as follows:

$F(\xi) = 1$, if $\xi\in R^n-O'$,

$F(\xi) = 0$, else. (15)
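The construction of F from the drift-only flow can be sketched numerically in one dimension. This is an illustration under stated assumptions, not the paper's construction: the flow is integrated with a plain Euler scheme, the union over all i is truncated at a hypothetical `i_max`, and the names `hitting_time`, `in_O_i` and `F` (as a Python function) are introduced here for the example only.

```python
def hitting_time(xi, alpha, in_open_set, dt=1e-3, t_max=10.0):
    """Approximate time for the flow d(xi')/dt = alpha(xi') to leave the set.

    Returns infinity if the flow is still inside the set at t_max.
    """
    t = 0.0
    while in_open_set(xi) and t < t_max:
        xi += alpha(xi) * dt
        t += dt
    return t if not in_open_set(xi) else float("inf")


def in_O_i(xi, i, alpha, in_open_set):
    """Membership of the i-th set: inside the open set, boundary at least 1/i away in time."""
    return in_open_set(xi) and hitting_time(xi, alpha, in_open_set) >= 1.0 / i


def F(xi, alpha, in_open_set, i_max=1000):
    """F = 0 on the (truncated) union of the sets O_i, and 1 elsewhere."""
    in_O_prime = any(in_O_i(xi, i, alpha, in_open_set) for i in range(1, i_max + 1))
    return 0 if in_O_prime else 1
```

For unit drift on the interval (0, 1), for example, the point 0.25 needs about 0.75 time units to reach the boundary, so it belongs to the second set and F vanishes there, while any point outside the interval gets F = 1.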

Due  to  the  above  construction,  F  is  measurable,  by 
which  theorem  2.2  yields: 

3.1  Corollary 

Given an open subset $O$ of $R^n$, and a mapping $F$,
defined by (10) through (15). Then, under the
assumptions of theorem 2.2, equation (9) has for any
$\xi_0\in O'$ a pathwise unique solution $\{\xi_t\}$. Moreover,
$\{\xi_t\}$ is then a Markov process, of which the sample
paths are measurable on the stochastic basis
$(\Omega,\mathcal{F},\mathbb{F},P)$.

Next, we come to the main result of this section,
which implies that $\{\xi_t\}$ is a Piecewise Deterministic
Markov process.

3.2 Theorem

With probability one, the process $\{\xi_t\}$, of corollary
3.1, exits $O\cup\partial O$ zero times on $(0,\infty)$.

Proof:

By the definition of $F$, all points of $p$ in $R_+$ become
active as soon as $\{\xi_t\}$ has exited $O'$. This situation
holds until $\{\xi_t\}$ reenters $O'$. The reentering may
occur due to drift or due to a jump generated by a
point of $p$ in $R_+$. Obviously, the cases in which $\{\xi_t\}$
reenters $O'$ by drift without exiting $O\cup\underline{\partial O}$ do not
cause any difficulties. In all other cases, the
probability of exiting $O\cup\underline{\partial O}$ by drift is

$\int_K^\infty \exp(-s/r)\,ds = r\exp(-K/r)$,

with $K \triangleq \inf\{1/i;\ i\in N\}$ and $1/r$ the intensity of
points of $p$ in $R_+$. Because $\{\xi_t\}$ exits $O'$ at most a
countable number of times, the probability of exit
$O\cup\underline{\partial O}$ at least once is then $(r/K)\exp(-K/r)$. If all
points  of  p  in  R+  are  active,  then  because  KSH, 
lfg  r/K  exp{-K/r)  «  0, 

which means a zero probability to exit $O\cup\underline{\partial O}$ on
$(0,\infty)$. Q.E.D.
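The two quantities in this proof can be checked numerically. The sketch below assumes a strictly positive constant K (an assumption made here for the illustration) and uses hypothetical helper names: `tail_integral` approximates the exponential tail integral by the midpoint rule, and `exit_bound` evaluates (r/K) exp(-K/r), which shrinks rapidly as r decreases to zero.

```python
import math


def tail_integral(r, K, n=100000):
    """Midpoint-rule approximation of the integral of exp(-s/r) over [K, K + 40 r].

    The tail beyond K + 40 r is negligible (relative size about exp(-40)).
    """
    upper = K + 40.0 * r
    h = (upper - K) / n
    return sum(math.exp(-(K + (j + 0.5) * h) / r) for j in range(n)) * h


def exit_bound(r, K):
    """The bound (r / K) * exp(-K / r) discussed in the proof, for K > 0."""
    return (r / K) * math.exp(-K / r)
```

Evaluating `exit_bound` for a decreasing sequence of r with K fixed shows the monotone decay towards zero that the limit argument relies on.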

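3.3 Theorem

The process $\{\xi_t\}$, of corollary 3.1, is a
semimartingale strong Markov process, and its
extended generator, $\mathcal{A}$, is given by:

$\mathcal{A}f = \mathcal{L}f + \mathcal{J}^+f$, for all $f\in D(\mathcal{A})$,

where $\mathcal{L}$ and $\mathcal{J}^+$ are given in proposition 2.3 with
$\beta=0$, while the domain of $\mathcal{A}$ is:

$D(\mathcal{A}) = \{f \in C^{1,b}(O)\cap C^b(O\cup\underline{\partial O});\ \mathcal{J}^+f(\xi)=0$, all $\xi\in\underline{\partial O}\}$.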

Proof:

Define a process $A_t$ as follows:

At  »  l  a(Es)ds  +  |  X  •  Es_G0')  *(ES_,U). 

.m(du) ds  +  i|1  J  £d  ^(ESi_,u)  dUjX«(dn), 

with $S_i$ the $\mathcal{F}_t$-adapted times that $\{\xi_t\}$ jumps from
$R^n-O'$ into $O'$, $i\ge 1$ and $S_0\triangleq 0$,

$S_i \triangleq \inf\{s > S_{i-1};\ \xi_{s-}\in R^n-O' \wedge \xi_s\in O'\}$.

Obviously, $\{A_t\}$ is of finite variation on any finite
time-interval, while $\{\xi_t-A_t\}$ is a local
$\mathcal{F}_t$-martingale. Subsequently, $\{\xi_t\}$ is a
semimartingale. Application of Itô's differentiation
rule for discontinuous (piecewise deterministic)
semimartingales to $f(\xi_t)$, with $f \in C^1$, yields:

«(Et)  -  f(E0)  +  I  ~  C(Es-)  C<iCEU  + 

+  0<ht  R^Rd  Cf(«s-+HEe-,u))  -  f(Es.)  + 

“  ~  f(Eg_)  [*(«S-,U)i)  p(  {  S)  ,dU)  , 

up  to  indistinguishability. 

Substitution of A''.4,

$p(ds,du) = q(ds,du) + ds\times m(du)$,
$d\xi_s = dA_s + d$(local martingale),
$m(du) = du_1\times\mu(d\underline{u})$,

and using $f \in C^{1,b}(O) \cap C^b(O\cup\underline{\partial O})$, yields

f(Et)  -  f(E0)+i21  j  ~f(ts)[«(Es)]ids  +  l  X(Eg-G0') 

•A(JS_)  £d  (f(Es-+*(E8..,u))  -  f  ( Eg*.)  J  dsxduxx«(dU)  + 

+  iii  i  &  tf<v+*«Si-'u))  ■  e<vn  duiXM(du)  + 

+  d (local  martingale), 
up  to  indistinguishability. 

Next we use the property that

7+2(E)  -  0,  all  EGM. 

Because  a  is  of  linear  growth  and  (Et)  is  locally 
bounded,  (a(Et))  is  locally  bounded.  This  implies 
that  (Et)  does  not  increase  while  travelling  through 
0-0'  to  a®  this  takes  a  time  interval  of  zero 
duration.  The  latter  and  the  assumptions  that 
tac°(0U2J2)  and  7+f(E)«0  for  all  EG££,  imply  that 
7+f(Es)-0  for  all  E6S0-0'.  With  this, 


f(Et)  "  f(Eo)  +  ff(Xa)  ds  +  d( local  martingale)  + 

+  l  M|a-)^d  (f(ta+f(E8,u))  ~  f ( E g) )  dsxduxxx (du) . 
Substitution  of  ?+  yields 
f (Et)  “  f(Eo)  +  jj  if(E0)  da  +  d(loca!  martingale), 

which implies that $\{\xi_t\}$ is a strong Markov process
with extended generator $[\mathcal{A}, D(\mathcal{A})]$. Q.E.D.


Having obtained PD Markov processes as solutions of
an SDE, the next step is to include diffusion.
Therefore we consider the following SDE:

$d\xi_t = \alpha(\xi_t)\,dt + \beta(\xi_t)\,dw_t + \int_{R_+\times R^d} \psi(\xi_{t-},u)\,\chi[\,(u_1<\Lambda(\xi_{t-})) \cup (F(\xi_{t-})\neq 0)\,]\, p(dt,du)$, (16)

which corresponds to (2.a,b) if $\psi$ vanishes on $R_-\times R^d$.
Initially we assume that $\beta(\xi)\beta(\xi)^T$ is positive
definite for all $\xi\in R^n$, but relax this assumption
further on.

Now we construct $F$, starting from the following
differential equation, on $(0,\infty)\times R^n$,

$d\xi'_t = \alpha(\xi'_t)\,dt + \beta(\xi'_t)\,dw_t$, $t\in(0,\infty)$, (17)

which has pathwise unique solutions under
assumptions A.1 and A.2, and which defines a family
of homogeneous Markov processes with a measurable
transition function

$P'_\xi(t,A) \triangleq P(\xi'_t\in A \mid \xi'_0=\xi)$, all Borel $A$. (18)

Because $\beta\beta^T$ is positive definite, any element of $\partial O$
is accessible by $\{\xi'_t\}$ from $O$. Therefore we
initially use the following Euclidean distance
function,

$d_\beta(\xi,\partial O) \triangleq \inf\{|\xi-y|;\ y\in\partial O\}$, (19')

which, obviously, is a measurable mapping.

Next, we define the Borel sets $O_i$ as follows,

$O_i \triangleq \{\xi\in O;\ d_\beta(\xi,\partial O) \ge 1/i\}$, $i\in N$, (20)

and from this the Borel set

$O' \triangleq \bigcup_{i\in N} O_i$. (21)

As before, we define our particular $F$ as follows:

$F(\xi) = 1$, if $\xi\in R^n-O'$,

$F(\xi) = 0$, else. (22)
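For a concrete picture of the Euclidean construction, the sketch below takes the open unit disc in the plane as the open set; this choice, and the helper names, are assumptions made here for illustration. Every point of an open set has strictly positive distance to the boundary, so it lies in the i-th set for all sufficiently large i; the union of the nested sets is therefore the disc itself and F is the indicator of its complement.

```python
import math


def dist_to_boundary(xi):
    """Euclidean distance from a point of the plane to the unit circle."""
    return abs(1.0 - math.hypot(xi[0], xi[1]))


def in_O_i(xi, i):
    """Membership of the i-th set: inside the disc, at least 1/i from the circle."""
    return math.hypot(xi[0], xi[1]) < 1.0 and dist_to_boundary(xi) >= 1.0 / i


def F(xi):
    """F = 0 on the union of all the nested sets (the open disc), 1 elsewhere."""
    return 0 if math.hypot(xi[0], xi[1]) < 1.0 else 1
```

The sets grow with i: a point at radius 0.9 is outside the second set but inside the twentieth.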

Obviously, $F$ is measurable, by which theorem 2.2
yields:

4.1 Corollary

Given an open subset $O$ of $R^n$, and a mapping $F$,
defined by (17), (18), (19'), (20), (21) and (22).
Then, under the assumptions of theorem 2.2, equation
(16) has for any $\xi_0\in O'$ a pathwise unique solution
$\{\xi_t\}$. Moreover, $\{\xi_t\}$ is then a Markov process, with
sample paths being measurable on the stochastic
basis $(\Omega,\mathcal{F},\mathbb{F},P)$.

Next, we come to the characterization of the
boundary behaviour and the strong Markov property of
$\{\xi_t\}$.

4.2 Theorem

With probability one, the process $\{\xi_t\}$, of corollary
4.1, exits $O\cup\partial O$ zero times on $(0,\infty)$.

Proof: 

By the definition of $F$, all points of $p$ become
active as soon as $\{\xi_t\}$ has exited $O'$, say at moment $T$,
which situation continues until $\{\xi_t\}$ has reentered
$O'$, say at moment $T+\Delta$. The exit may occur due to
diffusion or due to a jump generated by a point of $p$
in $R_+$. Obviously, the cases in which $\{\xi_t\}$ exits $O-O'$ by
diffusion without entering $\partial O$ do not cause any
difficulties. In all other cases we know from the
proof of theorem 3.3 that $\Delta$ has an exponential
distribution of which both the mean and the standard
deviation equal $r\downarrow 0$. With this, it follows that,
for any $\xi\in O'$, the probability of entering and
exiting $\partial O$ within $1/r$ is:

$r^{-1} P'_\xi(r, R^n-O-\partial O) \le r^{-1} P'_\xi(r, \{y\in R^n;\ |\xi-y| > K\})$,

with $K = \inf\{1/i;\ i\in N\}$.

Because $\{\xi'_t\}$ is a diffusion and $K>0$, the right hand
side is of order $r$ (Gihman and Skorohod, 1972, p.
64). As this situation may occur a countable number
of times, we have to divide by $K$, yielding order
$(r/K)$, of which the limit, $r\downarrow 0$, is zero. Q.E.D.

4.3 Proposition

Given the assumptions of theorem 4.2 are satisfied.
Then for all $\xi_0\in O'$, $\{\xi_t\}$ is a semimartingale strong
Markov process, and its extended generator, $\mathcal{A}$, is
given by:

$\mathcal{A}f = \mathcal{L}f + \mathcal{J}^+f$, for all $f\in D(\mathcal{A})$,

where $\mathcal{L}$ and $\mathcal{J}^+$ are given in proposition 2.3, while
the domain of $\mathcal{A}$ is:

$D(\mathcal{A}) = \{f \in C^{2,b}(O)\cap C^b(O\cup\partial O);\ \mathcal{J}^+f(\xi)=0$, all $\xi\in\partial O\}$.

Proof: Similar to the proof of theorem 3.3,
except that now $\mathcal{J}^+f(\xi_s)=0$, for all $\xi_s\in O-O'$, follows
from $f\in C(O\cup\partial O)$. Q.E.D.

Finally, we consider the more general situation with
$\beta(\xi)\beta(\xi)^T$ being positive semidefinite. The
construction of $F$ works according to equations (17),
(18), (20), (21) and (22), but with distance
functions

$d_\beta(\xi,\underline{\partial O}) \triangleq \inf\{\tau\ge 0;\ (\underline{\partial O} \cap E_{\xi,\tau})\neq\emptyset\}$, (19)

where $\underline{\partial O}$ is the subset of $\partial O$ that is accessible by
$\{\xi'_t\}$ from $O$, $\emptyset$ is the empty set and $E_{\xi,\tau}$ is the
closure of an n-dimensional ellipsoid, with centre
$\xi+\alpha(\xi)\tau$ and shape defined by covariance $\beta(\xi)\beta(\xi)^T\tau$.
Obviously, $d_\beta(\cdot,\underline{\partial O})$ is measurable, by which the $O_i$'s
and $O'$ are Borel sets and $F$ is measurable, and we
get:


4.4 Corollary

Given an open subset $O$ of $R^n$, and a mapping $F$,
defined by (17) through (22). Then, under the
assumptions of theorem 2.2, equation (16) has for
any $\xi_0\in O'$ a pathwise unique solution $\{\xi_t\}$. Moreover,
$\{\xi_t\}$ is then a Markov process, with sample paths
being measurable on the stochastic basis $(\Omega,\mathcal{F},\mathbb{F},P)$.

Next,  we  come  to  the  main  result  of  this  section. 

4.5 Theorem

With probability one, the process $\{\xi_t\}$, of corollary
4.4, exits $O\cup\underline{\partial O}$ zero times on $(0,\infty)$.

Proof:

By the definition of $F$, all points of $p$ in $R_+$ become
active as soon as $\{\xi_t\}$ has exited $O'$. This situation
holds until $\{\xi_t\}$ reenters $O'$. The reentering may
occur due to drift and/or diffusion or due to a jump
generated by a point of $p$ in $R_+$. Obviously, the
cases in which $\{\xi_t\}$ reenters $O'$ by drift and/or
diffusion without exiting $O\cup\underline{\partial O}$ do not cause any
difficulties. For those cases where $\underline{\partial O}$ is accessible
through drift only, we follow the proof of theorem
3.2. Say $\partial O_\alpha$ is the subset of $\underline{\partial O}$ that can only be
entered by $\{\xi'_t\}$ due to drift. For all other cases
we then notice that a strictly positive type (19)
distance $d_\beta$ at the moment of exit from $O'$ corresponds
with a strictly positive Euclidean distance from
$\underline{\partial O}-\partial O_\alpha$, due to the local boundedness of $|\alpha(\xi_t)|$ and
$|\beta(\xi_t)|$. Subsequently, we may follow the proof of
theorem 4.2 for these cases. Q.E.D.

4.6 Theorem

Given the assumptions of corollary 4.4 are
satisfied. Then for all $\xi_0\in O'$, $\{\xi_t\}$ is a
semimartingale strong Markov process, and its
extended generator, $\mathcal{A}$, is given by:

$\mathcal{A}f = \mathcal{L}f + \mathcal{J}^+f$, for all $f\in D(\mathcal{A})$,

where $\mathcal{L}$ and $\mathcal{J}^+$ are those given in proposition 2.3,
while the domain of $\mathcal{A}$ is:

$D(\mathcal{A}) = \{f \in C^{2,b}(O)\cap C^b(O\cup\underline{\partial O});\ \mathcal{J}^+f(\xi)=0$, all $\xi\in\underline{\partial O}\}$.

Proof: Similar to the proofs of theorem 3.3 and
proposition 4.3.

5 THE HYBRID STATE SPACE SITUATION

In this section we explicitly consider the hybrid
state space situation for a system of the form
(2.a,b), in such a way that there is no need of
assuming a particular $F$ or $\Lambda$. As such, all jump
reflecting boundary results of the former sections
fit into the results of this section. For ease of
notation and interpretation, we rewrite the SDE form
(2.a,b) by replacing the Poisson random measure, $p$,
by a multivariate counting process, $\nu_t$, such that
the pathwise uniqueness of (2)'s solution is
preserved. We do that by defining, for all Borel $U\subset R_+\times R^d$,

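$\nu_t(U) \triangleq \int_0^t\int_U \chi[\,(u_1<\Lambda(\xi_{s-})) \cup (F(\xi_{s-})\neq 0)\,]\, p(ds,du)$, (23.a)

and then rewriting (2) as

$d\xi_t = \alpha(\xi_t)\,dt + \beta(\xi_t)\,dw_t + \int_{R_-\times R^d} \psi(\xi_{t-},u)\, q(dt,du) + \int_{R_+\times R^d} \psi(\xi_{t-},u)\, d\nu_t(du)$. (23.b)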

The main objective of this section is to show that
the last term of (23.b) generates a particular type
of jump: a jump in $\{\underline{\xi}_t\}$ that anticipates a
simultaneous switching of $\{\xi^1_t\}$. For short we refer
to this type of jump as a hybrid jump. Notice that
these hybrid jumps are in some sense unexpected, as
all coefficients of (23.a,b) are non-anticipating.

To show these hybrid jumps explicitly, we need some
preparation.

5.1 Lemma

Under assumptions A.1, A.2, A.3, A''.4 and A''.5, the
pair of equations (23.a,b) has for any $\xi_0\in O'$ a
pathwise unique solution $\{\xi_t,\nu_t\}$, where $\nu_t$ is a
multivariate counting process on $R_+\times R_+\times R^d$ of a
predictable intensity, $\Lambda_t\triangleq\Lambda(\xi_{t-})$. Moreover both
$\{\xi_t,\nu_t\}$ and $\{\xi_t\}$ are semimartingale strong
Markov processes, of which $\{\xi_t\}$ is indistinguishable
from the one in theorem 2.2.


Proof: 

It follows from theorem 2.2 that the system of
equations (2.a,b) and (23.a) has, for any Borel $U$, a
pathwise unique solution $\{\xi_t,\nu_t(U)\}$. With this,
system (2.a,b), (23.a) has a pathwise unique
solution $\{\xi_t,\nu_t\}$. Obviously all potentially active
points of $p$, that are in $R_+\times R_+\times R^d$, are collected by
$\nu_t$ in a predictable way, by which we can write

$\int_{R_+\times R^d} \psi(\xi_{t-},u)\,\chi[\,(u_1<\Lambda(\xi_{t-})) \cup (F(\xi_{t-})\neq 0)\,]\, p(dt,du) = \int_{R_+\times R^d} \psi(\xi_{t-},u)\, d\nu_t(du)$,

up to indistinguishability. This implies that the
solution of (2.b) is indistinguishable from the
solution of (23.b). Q.E.D.

Now we are prepared to consider the hybrid state
space situation. Therefore we assume that the first
component of $\xi_t$ is M-valued, with $M \subset N \triangleq \{1, 2, \ldots\}$, and
that we can write the first scalar equation of
(23.b) as follows:

dExt  -  R+/Rd  *i(Et-,u>  dct(du),  (23. c) 

with  a  mapping  of  RnxR+xRd  into  the  integer 
lattice,  Z. 

Next  we  assume  that  *  satisfies,  for  all  Uj6(0,A(E)J, 

*(e'u)“h£m  X[  iSoV(i'£)  2  U1  < 

.  (24) 

whore  v  is  a  measurable  mapping  of  MxRnxRa  into 
ZxRn“l,  and  1  is  a  measurable  mapping  of  NxRn  into 
R+,  such  that  X (i , . ) “0  for  all  i6N-M,  and 

iIn  x(1'e>“  a(e>. 

Moreover,  we  assume  that  for  all  n6N,  E6ZxRn_1  and 
U6Rd, 

(25) 

for  all 

~Qtn,  men  itAt)  R+5di- 
Substitution  of  (24)  and  (25)  in  (23.b,c)  and 
subsequent  evaluation  yield 

dElt  "  &  nh.  x[  ilix(i'Et-)  *  u*t-  <  Joxa»«t-)]- 

.  • (t-E1t-)  dyt(duxXRd),  (26. a) 

with:  u*s  =  Ui^-kAtEg), 

for  some  integer  k  such  that  0  <  u  -  £.  A(ES), 
dlt  "  a(«t>dt  +  £(*t)dwt  VxRd  *(Et-'u>  q(dt,du)  + 
+  (d  dvt(R+'Xdu),  (26. b) 

yt(U)  “  l  £  X(  Cu1SA(E8_))U(F(Es_)'!0]  )  p(ds,du) , 

"  (26. c) 

all  Borel  UCR+xRd,  where  underlining  of  a  vector 
refers  to  all,  but  the  first,  components  of  that 
vector. 

Assumptions

A.  4  Given,  for  all  E6Rn-0'  and  uSR+xRd, 

A (E)  -  g^C.E)  -  1, 

«’l(t,E,u)  -  n-Ex, 
m(du)  -  dUjXR(du). 

For  all  k6N  there  exists  a  constant  Mjj,  such 
that  for  all  EGBi,. 

1(1, E)  [In-Eil  +  £d  |S»(n,E,U)  I  ^(d!l)]  *  Mk* 


*i(i,e,vj)  -  n-Ei  , 

which,  together  with  (24)  and  X(i,.)»0 
i6N-K,  implies  that  if  E1oGM,  then  (Ext 


a-  For  ill  *60', 

X(i,l)  is  continuous  in  *, 

X(i,E)-0  for  all  i6N-H, 

A(U  5  X  ( i ,  t )  is  uniformly  bounded. 

fe*  (ft)/  t6R|_,  exits  0'  at  most  a  countable  number 
of  times. 

5.2 Theorem

Given  the  hybrid  space  O'  s  0'n(KxRn-1) . 

Under  assumptions  AU.  through  the  system  of 
equations  (26.a,b,c)  has  for  any  in60'  *  pathwise 
unique  solution  {lt,»t}.  Moreover  {tt}  is  then  a 
semimartingale  strong  Markov  process  in  R^.xO'. 

Proof: 

Duo  to  A. 3  and  A. 5. a.  (24)  defines  *  as  a  measurable 
mapping  (see  proof  of  theorem  2.2),  by  which 
(26.a,b,c)  is  a  special  case  of  (23.a,b,c).  Next  we 
show  that  A. 4  implies  A" ■ 4 ,  by  which  lemma  5.1  and 
(24)  imply  that  the  solution  of  (23.a,b,c)  is 
indistinguishable  from  the  solution  of  (26.a,b,c). 

To arrive at A".4, we start from A.4 and
subsequently use A.5.a, interchange the order of
integration and substitute (24). Q.E.D.

Due to its extensive form, equation (26.a,b,c) hides
the  results  for  which  the  above  analysis  has  been 
carried  out.  Therefore,  we  take  a  closer  look  at  it 
in  case  that  p  has  no  points  in  R“.  Then,  (26.b) 
becomes 

dlt  -  a(Et)dt  +  £(*t)dvt  +  £d  2a1t'Et-'U)dl't(R+xda) 

(27. a) 

Moreover,  to  avoid  the  use  of  equations  (26. a, c) ,  we 
go  over  to  the  common  descriptive  way  of  formulating 
{►t>  and  ( C1t ) : 

(vt)  is  a  multivariate  counting  process 
characterized  by  the  ?t-predictable  intensity,  rt, 

Ft  -  AUt-)  [1  +  F(tt-)  If g  1/r],  (27. b) 

and  a  deterministic  jump  measure  A(dq). 

(E1^}  is  a  process  with  a  countable  state  space,  N, 
and  with  an  3*.-predictable  rate,  r;i  *■/  of  jumping 
from  t*t_-j  tS  tVi,  i*j,  13,t 

rij,t  s  Mi,(j.It->)  C1+  F(wlt-)  1/T]'  (27,c) 

while  rij(t  S  rt. 

From  this  formulation,  we  easily  notice  the 
interesting  effect  that  appears  in  the 
coefficient,  »,  of  (27. a) 's  third  right  hand  term. 
This  means  that  Et-<U)  anticipates  a  switching 

from  £■*■£_  to  E1*/  and  thus  a  jump  of  (J.*) 
anticipates  a  simultaneous  transition  of  { e }  • 
Verify  that  the  anticipating  coefficient  z  already 
appears  in  (26.a,b,c),  while  there  is  no 
anticipating  coefficient  in  equation  (23.a,b,c).  As 
the  solutions  of  both  equations  are 
indistinguishable,  we  conclude  that  (23.a,b,c)  is 
the canonical representation of a system with hybrid
jumps, while (26.a,b,c), with the anticipating
coefficient, is the representation that is more
useful when it comes to the realization of Markov
models with hybrid jumps.

Remark?  If  X(.,(tx,l))  is  i-invariant,  then  (E1t>  is 
a  countable  state  Markov  process.  In  this  case 
(27. a)  can  straightforwardly  be  obtained  from  a 
classical  system  like  (1)  of  which  all  coefficients 
are  continuous.  For  the  situation  that  (1^)  is 
continuous,  l.e.  »“0,  see  Brockett  and  Blankenship 
(1977).  For  some  applications  with  hybrid  jumps, 
i.e.  »*o,  sea  Sworder  (1972),  Blom  (1984)  and 
Mariton  (1987). 


The  author  is  grateful  to  Professor  Yaakov 
Bar-Shalom  for  stimulating  discussions  and  his 
hospitality  at  the  University  of  Connecticut. 
Abstract


A  realistic  stochastic  control  problem  for  hybrid  systems  with  Markovian 
jump  parameters  can  have  the  switching  parameters  in  both  the  state  and 
measurement  equations.  Furthermore,  both  the  system  state  and  the  jump  states 
are,  in  general,  not  perfectly  observed.  Currently  there  are  only  two  existing 
controllers  for  this  problem.  One  is  based  upon  a  heuristic  multiple  model 
partitioning  (MMP)  and  hypothesis  pruning.  The  other  utilizes  the  entire 
future  tree  of  models,  and  is  called  the  Full-Tree  (FT)  controller.  The 
performance  of  the  latter  is  superior  to  the  former  and  their  complexities  are 
similar.  In  this  paper  we  present  a  new  stochastic  control  algorithm  for 
stochastic  systems  with  Markovian  jump  parameters.  This  control  algorithm  is 
derived  through  the  use  of  stochastic  dynamic  programming  and  is  designed  to  be 
used  for  realistic  stochastic  control  problems,  i.e.,  with  noisy  state 
observations.  This  new  scheme,  which  is  based  upon  the  interaction  of  r  (the 
number  of  models)  model-conditioned  Riccati  equations,  has  a  natural 
parallelism  and  is  straightforward  to  implement.  The  state  estimation  and 
model  identification  is  done  via  the  recently  developed  Interacting  Multiple 
Model  algorithm.  Simulation  results  show  that  a  substantial  reduction  in  cost 
can  be  obtained  by  this  new  control  algorithm  over  the  MMP  scheme. 

Furthermore,  the  performance  of  the  new  algorithm  is  shown  to  be  practically 
the  same  as  that  of  the  FT  scheme  even  though  the  new  scheme,  which  has  a  fixed 
amount  of  computations  at  each  step  of  the  recursion,  is  much  simpler  to 
implement  than  both  the  MMP  and  FT  algorithms. 
1. Introduction

An  important  problem  of  engineering  concern  is  the  control  of  discrete-time 
stochastic  systems  with  parameters  that  may  switch  among  a  finite  set  of 
values.  In  this  paper  we  present  the  development  of  a  new  controller  for 
discrete-time  hybrid  jump-linear  Gaussian  systems.  Here  the  state  and 
measurement  equations  have  parameter  matrices  which  are  functions  of  a  Markov 
switching process. The jump states are not observed and only the system state
is observed in the presence of noise.

This new controller has control gain coefficients that can be generated
off-line and is designed to be real-time implementable. It belongs to the
open-loop feedback (OLF) class [B3]; incorporation of the dual effect would
have precluded the above two rather important features. To date, there is no
dual (closed-loop) controller for jump-linear stochastic systems with noisy
observations. Some preliminary work along these lines has been reported in
[C3].

In  addition  to  presenting  a  practical  control  algorithm  we  also  point  out 
an  interesting  theoretical  phenomenon.  We  show  that  there  is  a  natural 
connection  between  the  Interacting  Multiple  Model  (IMM)  state  estimation 
algorithm [B1, B5] and the control of jump-linear systems. Thus the IMM is the
state  estimation  algorithm  of  choice  for  use  in  these  types  of  control 
problems. 

Systems which belong to the jump-linear class are found in many areas.
Systems of a highly nonlinear nature can be approximated by a set of linearized
models [M1, V1, V2]. A failure in a component of a dynamical system (or
subsequent repair) can be represented by a sudden change in the system's
parameters [B2, S1, W1]. Also economic problems, which can be modelled by
parameters that are subject to sudden changes due to shortages in important
materials [G1], belong to this class. And, as is noted in [M2], there also
exist applications to the design of control systems for large flexible
structures in space.

There has been an extensive amount of work done in this area and on the
related problem of controlling stochastic dynamic systems with unknown,
time-invariant parameters. We refer the reader to [T1] and [G1] for a list of
references and a discussion of their scope and applications.

More  recently  in  [S2]  a  feedforward/feedback  controller  was  presented  for 
the  continuous-time  problem  with  a  completely  observed  system  state  and  where 
the  "modal  indicator"  is  measured  with  a  high  quality  sensor.  In  [M2]  the 
continuous-time  jump-linear  problem  is  considered  where  the  system  state  and 
"modal  processes"  are  perfectly  observed.  The  optimal  regulator  was  obtained 
and  notions  of  stochastic  stabilizability  and  detectability  were  introduced  to 
characterize  the  behavior  of  the  optimal  system  over  long  time  intervals.  In 
[M3]  the  continuous-time  jump-linear  problem  with  additive  and  multiplicative 
noises  and  noisy  measurements  of  the  plant  state  was  considered  with  the  plant 
mode  assumed  to  be  perfectly  observed. 

A sufficient stability test was given in [E1] for checking the asymptotic
behavior of the error introduced by the averaging of hybrid systems. In [M4]
the continuous-time jump-linear problem with non-Markovian regime changes was
considered. A control scheme was presented for the case of perfect
observations of the system state and plant regime.

In [C1] a discrete-time Markovian jump optimal control problem was
considered. The controller is for the case of perfect system state
observations and known form process (mode). The authors derived necessary and
sufficient conditions for the existence of optimal constant control laws which
stabilize the controlled system as the time horizon becomes infinite. Through
examples they showed the interesting result that stabilizability of the system
in each form is neither necessary nor sufficient for the existence of a stable
steady-state closed-loop system.

In [Y1] a discrete-time system with perfect state and mode information was
considered. A controller was presented which is stabilizing in the mean square
exponential sense.

As pointed out in [G1], we generally cannot determine the optimal
jump-linear  quadratic  Gaussian  closed-loop  control  law  analytically  even  for  a 
two-step  problem.  In  order  to  compute  the  optimal  control,  extensive  numerical 
search  methods  must  be  employed  and  thus  one  would  like  to  find  simpler 
suboptimal  control  schemes. 

Currently  there  exist  two  implementable  controllers  for  this  problem 
(switching  parameters  in  the  system  state  and  measurement  equations  and  noisy 
state observations). One of them is the one discussed in [T1] and is of the
OLF class. This algorithm is based upon a heuristic multiple model
partitioning (MMP) and hypothesis pruning. The other one is the Full-Tree (FT)
scheme developed in [C2].

The  MMP  approach,  being  conceptually  simple  and  straightforward  to 
implement,  is  a  reasonable  choice  for  the  time-invariant  unknown  parameter 
problem [L1], and, as shown in [T1], it works well for applications involving
switching  parameters  in  the  state  measurement  equation  only.  For  the 
non-switching  parameter  problem  the  operating  mode  is  determined  to  a  high 
probability  in  a  relatively  short  period  of  time  and  then  the  MMP  approach 
gives  the  linear  quadratic  Gaussian  optimal  control. 

For  switching  parameter  problems  a  different  situation  exists.  Because  of 
switching,  the  operating  mode  may  never  be  determined  with  high  probability. 
The  approach  taken  here  to  derive  a  suboptimal  control  scheme  is 
to start with the stochastic dynamic programming formulation. By utilizing
dynamic programming and making appropriate suboptimal assumptions, a recursion
is derived and the use of numerical search methods has been avoided. We thus
have developed a multiple model control scheme which has the following
desirable properties: (a) it gives the optimal last stage control, (b) it
utilizes the IMM state estimation scheme, (c) it has the same property as the
MMP  and  FT  controllers  in  that  it  gives  the  optimal  linear  quadratic  control 
under  the  assumption  of  a  perfectly  known  model  history  sequence  (which  is, 
however,  an  unrealistic  assumption  for  this  class  of  problems),  and  (d)  it  is 
implemented  naturally  using  parallel  processors. 

For  comparison  purposes  we  implement  the  "switching  parameters  in  the 
system state equation" controller, proposed (but not tested) in [T1], and the
FT scheme of [C2]. We show via examples that a statistically significant
reduction in cost can be achieved through the use of our controller over the
MMP scheme. Also our new algorithm is shown to have practically the same
performance as the FT controller, which was shown in [C2] to be significantly
superior  to  the  MMP  algorithm.  But,  since  our  new  algorithm  has  a  fixed  amount 
of  computations  for  each  step  of  the  backwards  recursion,  as  compared  to  the 
exponentially  growing  amount  of  computations  for  the  FT  scheme,  it  is  much 
simpler  to  implement. 

The  paper  is  outlined  as  follows.  In  Section  2  the  problem  formulation  is 
given.  In  Section  3  the  connection  between  the  IMM  state  estimation  algorithm 
and  the  control  of  multiple  model  systems  is  shown.  In  Section  4  we  derive  the 
new  control  scheme  which  is  suitable  for  real-time  implementation.  In  Section 
5  we  use  simulations  to  compare  the  MMP  control  algorithm  with  the  FT 
controller  and  with  our  recursive  real-time  implementable  scheme. 
2. Problem Formulation

The problem to be solved is discussed next. We took the pragmatic approach
of starting with the available mathematical and statistical tools found to
yield success in solving similar problems of this type in the past (i.e., use
is made of the stochastic dynamic programming method and the total probability
theorem, etc.). As we shall see, not only does this practical engineering
approach  yield  an  improved  multiple  model  control  algorithm,  but  it  also  leads 
to  the  interesting  theoretical  observation  of  a  direct  connection  between  the 
IMM  state  estimation  algorithm  and  jump-linear  control. 

It  is  desired  to  find  a  sequence  of  causal  control  values  to  minimize  the 
cost  functional 

\[ J = E\{C(0)\} = E\Big\{ x(N)'Q(N)x(N) + \sum_{k=0}^{N-1} \big[ x(k)'Q(k)x(k) + u(k)'R(k)u(k) \big] \Big\} \tag{2.1} \]

where $Q(k) \ge 0$ for each k = 0, 1, ..., N and it is sufficient that $R(k) > 0$ for
each k = 0, 1, ..., N-1.

The  discrete-time  system  state  and  measurement  modeling  equations  are 

\[ x(k) = F[M(k)]\,x(k-1) + G[M(k)]\,u(k-1) + v[k-1, M(k)] \tag{2.2a} \]

\[ z(k) = H[M(k)]\,x(k) + w[k, M(k)], \qquad k = 0, 1, 2, \ldots \tag{2.2b} \]

where x(k) is an n x 1 system state vector, u(k) is a p x 1 control input,
and z(k) is an m x 1 system state observation vector. The argument M(k)
denotes the model "at time k", i.e., the model in effect during the sampling
period ending at k. The process and measurement noise sequences, v[k-1,M(k)]
and w[k,M(k)], are white and mutually uncorrelated.

The  model  at  time  k  is  assumed  to  be  among  a  finite  set  of  r  models 

\[ M(k) \in \{1, 2, \ldots, r\} \tag{2.3} \]

for example

\[ F[M(k)=j] = F_j \tag{2.4} \]

v[k-l,M(k)=j]  ~  iV(p.,V.] 

(2.5) 

w[k,M(k)=j]  ~  Aak-.Wj.l 

(2.6) 

i.e.,  the  structure  of  the  system  and/or  the  statistics  of  the  noises  might  be 
different  from  one  model  to  the  next. 

The model switching process to be considered here is of the Markov type.
The process is specified by a transition matrix with elements $p_{ij}$. Let

\[ I^k \triangleq \{ z(0), z(1), \ldots, z(k), u(0), u(1), \ldots, u(k-1) \} \tag{2.7} \]

denote the information available to the controller at time k (i.e., the control
is causal).
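
The model (2.2)-(2.3) with Markov switching can be illustrated by a short simulation. The following is only a sketch under stated assumptions: the function name `simulate_jump_linear`, the zero-mean Gaussian noises, the uniformly drawn initial mode and the 0-based model indices are illustrative choices, not part of the original formulation.

```python
import numpy as np

def simulate_jump_linear(F, G, H, V, W, p, x0, controls, rng):
    """Simulate x(k) = F[M(k)] x(k-1) + G[M(k)] u(k-1) + v and
    z(k) = H[M(k)] x(k) + w for a Markov mode sequence M(k).

    F, G, H, V, W are lists indexed by model (0-based here, 1..r in the text);
    p[i][j] is the probability of switching from model i to model j.
    Assumption: noises are zero-mean Gaussian with covariances V[j], W[j].
    """
    r = len(F)
    p = np.asarray(p, dtype=float)
    x = np.asarray(x0, dtype=float)
    mode = int(rng.integers(r))  # assumption: uniform initial mode
    modes, states, meas = [], [], []
    for u in controls:
        # M(k) is the model in effect during the period ending at k
        mode = int(rng.choice(r, p=p[mode]))
        v = rng.multivariate_normal(np.zeros(x.size), V[mode])
        x = F[mode] @ x + G[mode] @ np.atleast_1d(u) + v
        w = rng.multivariate_normal(np.zeros(H[mode].shape[0]), W[mode])
        meas.append(H[mode] @ x + w)
        modes.append(mode)
        states.append(x)
    return np.array(modes), np.array(states), np.array(meas)
```

The controller never sees `modes` or `states`; only the measurements and past controls enter the information set of (2.7).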
3. The Last Stage Control and the IMM Estimator

An integral part of any control algorithm for this class of problems is the
system state estimator. In this section we show that there exists an
interesting connection between the control of multiple model stochastic systems
and the IMM system state estimator [B1, B5]. To this end we start by solving
for the optimal control at time N-1. The optimal control at time N-1 is the
value of u(N-1) which minimizes

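\[
\begin{aligned}
J(N-1) &= E\{ x(N-1)'Q(N-1)x(N-1) + u(N-1)'R(N-1)u(N-1) + x(N)'Q(N)x(N) \mid I^{N-1} \} \\
&= \sum_{j=1}^{r} E\{ x(N-1)'Q(N-1)x(N-1) + u(N-1)'R(N-1)u(N-1) \\
&\qquad\quad + x(N)'Q(N)x(N) \mid I^{N-1}, M(N)=j \}\, P\{M(N)=j \mid I^{N-1}\}
\end{aligned}
\tag{3.1}
\]

Define

\[ \mu_j(N|N-1) \triangleq P\{M(N)=j \mid I^{N-1}\} \tag{3.2} \]

and use the state equation (2.2a) and (2.4), (2.5) in (3.1) to get

\[
\begin{aligned}
J(N-1) &= \sum_{j=1}^{r} E\{ x(N-1)'[Q(N-1) + F_j'Q(N)F_j]x(N-1) + 2u(N-1)'G_j'Q(N)F_j x(N-1) \\
&\qquad\quad + u(N-1)'[R(N-1) + G_j'Q(N)G_j]u(N-1) \mid I^{N-1}, M(N)=j \}\, \mu_j(N|N-1) \\
&\quad + \sum_{j=1}^{r} \operatorname{tr}[Q(N)V_j]\, \mu_j(N|N-1)
\end{aligned}
\tag{3.3}
\]

Now taking the gradient of (3.3) w.r.t. u(N-1) and setting it to zero yields

\[
u^*(N-1) = -\Big[ R(N-1) + \sum_{j=1}^{r} G_j'Q(N)G_j\, \mu_j(N|N-1) \Big]^{-1}
\sum_{j=1}^{r} G_j'Q(N)F_j\, E\{x(N-1) \mid I^{N-1}, M(N)=j\}\, \mu_j(N|N-1)
\tag{3.4}
\]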
Notice that

\[
E\{x(N-1) \mid I^{N-1}, M(N)=j\} = \sum_{i=1}^{r} E\{x(N-1) \mid I^{N-1}, M(N)=j, M(N-1)=i\}\, P\{M(N-1)=i \mid M(N)=j, I^{N-1}\}
\tag{3.5}
\]

where, since M(N)=j in the first conditioning is irrelevant (see for example
[B1]), the expectation inside the summation is

\[
E\{x(N-1) \mid I^{N-1}, M(N)=j, M(N-1)=i\} = E\{x(N-1) \mid I^{N-1}, M(N-1)=i\} = \hat{x}_i(N-1|N-1)
\tag{3.6}
\]

Using the following definition

\[ \mu_{i|j}(k|k) \triangleq P\{M(k)=i \mid M(k+1)=j, I^k\} \tag{3.7} \]

and (3.6) in (3.5) yields

\[
E\{x(N-1) \mid I^{N-1}, M(N)=j\} = \sum_{i=1}^{r} \hat{x}_i(N-1|N-1)\, \mu_{i|j}(N-1|N-1) \triangleq \hat{x}^{0j}(N-1|N-1)
\tag{3.8}
\]

which is the IMM mixed initial estimate [B1, B5].

Thus using (3.8) in (3.4) we get

\[
u^*(N-1) = -\Big[ R(N-1) + \sum_{j=1}^{r} G_j'Q(N)G_j\, \mu_j(N|N-1) \Big]^{-1}
\sum_{j=1}^{r} G_j'Q(N)F_j\, \hat{x}^{0j}(N-1|N-1)\, \mu_j(N|N-1)
\tag{3.9}
\]
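
The last-stage control law can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, model indices are 0-based, and the mixing probabilities of (3.7) are computed with the standard Bayes-rule expression, which assumes the mode chain is Markov with transition probabilities `p[i, j]`.

```python
import numpy as np

def mixing_probabilities(p, mu):
    """Return (mix, mu_pred) with mix[i, j] = P{M(k)=i | M(k+1)=j, I^k}
    and mu_pred[j] = P{M(k+1)=j | I^k}, given p[i, j] = P{M(k+1)=j | M(k)=i}
    and mu[i] = P{M(k)=i | I^k}."""
    p = np.asarray(p, dtype=float)
    mu = np.asarray(mu, dtype=float)
    joint = p * mu[:, None]        # joint[i, j] = p_ij * mu_i
    mu_pred = joint.sum(axis=0)    # total probability over i
    return joint / mu_pred, mu_pred

def last_stage_control(F, G, QN, R, p, mu, x_hat):
    """Evaluate the last-stage control from the model-conditioned
    estimates x_hat[i] = E{x(N-1) | I^{N-1}, M(N-1)=i}."""
    r = len(F)
    mix, mu_pred = mixing_probabilities(p, mu)
    x0 = mix.T @ np.asarray(x_hat, dtype=float)   # mixed initial estimates, one row per j
    A = R + sum(mu_pred[j] * (G[j].T @ QN @ G[j]) for j in range(r))
    b = sum(mu_pred[j] * (G[j].T @ QN @ F[j] @ x0[j]) for j in range(r))
    return -np.linalg.solve(A, b)
```

When all models share the same F and G, the expression collapses to the familiar one-step LQ control applied to the common estimate, which is a convenient sanity check.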
4. The Parallel Control Algorithm (PCA)

In the following we will derive a backwards recursive method for obtaining
control gain parameters that minimize an approximation to $J^*(k, I^k)$, which
is defined to be the optimal cost-to-go from time k to the end. Using this
definition and (2.1) the Bellman equation is written

\[ J^*(k, I^k) \triangleq \min_{u(k)} E\{ x(k)'Q(k)x(k) + u(k)'R(k)u(k) + J^*(k+1, I^{k+1}) \mid I^k \} \tag{4.1} \]

The  method  is  based  upon  the  backwards  propagation  of  r  model  conditioned 
matrix  Riccati  equations.  Each  of  these  r  equations  is  propagated  by 
utilizing  a  probabilistic  combination  of  the  Riccati  matrices  obtained  in  the 
prior  iteration.  Thus  at  each  iteration  of  the  backwards  recursion 
computational  requirements  are  fixed  and  the  scheme  has  a  natural  parallelism. 

In  order  to  obtain  a  control  based  upon  the  propagation  of  r  Riccati 
equations  we  first  denote  the  optimal  cost-to-go  at  time  k+1,  given  that 
M(k+2)=i,  as 

\[
J_i^*(k+1, I^{k+1}) \triangleq \min_{u(k+1)} E\{ x(k+1)'Q(k+1)x(k+1) + u(k+1)'R(k+1)u(k+1) + J^*(k+2, I^{k+2}) \mid I^{k+1}, M(k+2)=i \}
\tag{4.2}
\]

The  conditioning  with  a  time  k+2  model  is  used  because  this  is  the  model 
starting  immediately  after  time  k+1  -  see  (2.2).  These  model-conditioned  costs 
are  used  to  approximate  the  optimal  cost-to-go  in  (4.1)  as  follows.  The  total 
probability  theorem  is  used  as  follows 

\[ E\{J^*(k+1, I^{k+1}) \mid I^k\} = \sum_{i=1}^{r} E\{J^*(k+1, I^{k+1}) \mid M(k+2)=i, I^k\}\, P\{M(k+2)=i \mid I^k\} \tag{4.3} \]

We obtain our approximation of (4.3) by replacing the optimal cost-to-go, which
is a minimization of an expectation which has "smoothed out" M(k+2), with the
model-conditioned costs-to-go [which have M(k+2) in the conditioning as
indicated in (4.2)] as follows

\[
\sum_{i=1}^{r} E\{J^*(k+1, I^{k+1}) \mid M(k+2)=i, I^k\}\, P\{M(k+2)=i \mid I^k\}
\approx \sum_{i=1}^{r} E\{J_i^*(k+1, I^{k+1}) \mid M(k+2)=i, I^k\}\, P\{M(k+2)=i \mid I^k\}
\tag{4.4}
\]

Using (4.4) and (4.3) in (4.1) one obtains

\[
J^*(k, I^k) \approx \min_{u(k)} \Big\{ E\{x(k)'Q(k)x(k) + u(k)'R(k)u(k) \mid I^k\}
+ \sum_{i=1}^{r} E\{J_i^*(k+1, I^{k+1}) \mid M(k+2)=i, I^k\}\, P\{M(k+2)=i \mid I^k\} \Big\}
\tag{4.5}
\]

The cost-to-go from k+1 starting with M(k+2)=i is

\[
\begin{aligned}
J_i^*(k+1, I^{k+1}) &= \min_{u(k+1)} E\{ x(k+1)'Q(k+1)x(k+1) + u(k+1)'R(k+1)u(k+1) + J^*(k+2, I^{k+2}) \mid I^{k+1}, M(k+2)=i \} \\
&\approx \min_{u(k+1)} \Big\{ E\{ x(k+1)'Q(k+1)x(k+1) + u(k+1)'R(k+1)u(k+1) \mid I^{k+1}, M(k+2)=i \} \\
&\qquad + \sum_{j=1}^{r} E\{ J_j^*(k+2, I^{k+2}) \mid I^{k+1}, M(k+2)=i, M(k+3)=j \}\, P\{M(k+3)=j \mid M(k+2)=i, I^{k+1}\} \Big\}
\end{aligned}
\tag{4.6}
\]

where  a  similar  method  to  the  approximation  technique  that  led  to  (4.5)  was 
used  to  obtain  (4.6). 

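In order to obtain a recursion one can make the following assumptions

\[ J_j^*(k+2, I^{k+2}) \approx E\{ x(k+2)'P_j(k+2)x(k+2) \mid I^{k+2}, M(k+3)=j \} + \alpha_j(k+2) \tag{4.7} \]

and

\[
\begin{aligned}
&\sum_{j=1}^{r} E\{ E[ x(k+2)'P_j(k+2)x(k+2) \mid I^{k+2}, M(k+3)=j ] \mid I^{k+1}, M(k+2)=i, M(k+3)=j \}\, p_{ij} \\
&\qquad \approx E\Big\{ x(k+2)'\Big[ \sum_{j=1}^{r} P_j(k+2)\, p_{ij} \Big] x(k+2) \mid I^{k+1}, M(k+2)=i \Big\} \\
&\qquad = E\{ x(k+2)'S_i(k+2)x(k+2) \mid I^{k+1}, M(k+2)=i \}
\end{aligned}
\tag{4.8}
\]

where

\[ S_i(k+2) \triangleq \sum_{j=1}^{r} P_j(k+2)\, p_{ij} \tag{4.9} \]

and furthermore assume $\alpha_j(k+2)$ is independent of u(k+1). With (4.7),
(4.6) becomes

\[
\begin{aligned}
J_i^*(k+1, I^{k+1}) &\approx \min_{u(k+1)} \Big\{ E\{ x(k+1)'Q(k+1)x(k+1) + u(k+1)'R(k+1)u(k+1) \mid I^{k+1}, M(k+2)=i \} \\
&\qquad + \sum_{j=1}^{r} E\{ E[ x(k+2)'P_j(k+2)x(k+2) \mid I^{k+2}, M(k+3)=j ] + \alpha_j(k+2) \mid I^{k+1}, M(k+2)=i, M(k+3)=j \}\, p_{ij} \Big\}
\end{aligned}
\tag{4.10}
\]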


Now using (4.8) and (2.2a) in (4.10), and taking the gradient w.r.t. u(k+1)
and setting it to zero yields

\[ u_i(k+1) = -[R(k+1) + G_i'S_i(k+2)G_i]^{-1} G_i'S_i(k+2)F_i\, \hat{x}^{0i}(k+1) \tag{4.11} \]

where once again we see the IMM mixed initial estimate showing up. Thus, using
(4.11),

\[ J_i^*(k+1, I^{k+1}) \approx E\{ x(k+1)'P_i(k+1)x(k+1) \mid I^{k+1}, M(k+2)=i \} + \alpha_i(k+1) \tag{4.12} \]
where

\[
P_i(k+1) \triangleq Q(k+1) + F_i'\big[ S_i(k+2) - S_i(k+2)G_i[R(k+1) + G_i'S_i(k+2)G_i]^{-1} G_i'S_i(k+2) \big]F_i, \qquad i = 1, \ldots, r
\tag{4.13}
\]

\[ P_i(N) = Q(N) \tag{4.14} \]

\[
\alpha_i(k+1) = \operatorname{tr}[\beta_i(k+1)] + \sum_{j=1}^{r} E\{ \alpha_j(k+2) \mid I^{k+1}, M(k+2)=i, M(k+3)=j \}\, p_{ij}, \qquad i = 1, \ldots, r
\tag{4.15}
\]

\[ \alpha_i(N) = 0 \tag{4.16} \]

\[
\beta_i(k+1) \triangleq S_i(k+2)V_i + F_i'S_i(k+2)G_i[R(k+1) + G_i'S_i(k+2)G_i]^{-1} G_i'S_i(k+2)F_i\, \Sigma^{0i}(k+1|k+1), \qquad i = 1, \ldots, r
\tag{4.17}
\]

\[ \Sigma^{0i}(k+1|k+1) \triangleq \operatorname{Cov}[x(k+1) \mid I^{k+1}, M(k+2)=i] \tag{4.18} \]

We can see now that the assumption that $\alpha_i$ is independent of u,
i = 1, 2, ..., r, was made to avoid the implications of the dual effect. Note that
the coefficients (4.13) and (4.14) are computable off-line.

Using  (4.12)  in  (4.5)  (see  Appendix)  and  solving  for  the  control  which 
minimizes  the  approximate  cost,  one  obtains 


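\[
\begin{aligned}
u^{PCA}(k) = &-\Big[ R(k) + \sum_{i=1}^{r} \Big[ \sum_{j=1}^{r} G_j'P_i(k+1)G_j\, \mu_{j|i}(k+1|k) \Big] P\{M(k+2)=i \mid I^k\} \Big]^{-1} \\
&\cdot \sum_{i=1}^{r} \Big[ \sum_{j=1}^{r} G_j'P_i(k+1)F_j\, \hat{x}^{0j}(k)\, \mu_{j|i}(k+1|k) \Big] P\{M(k+2)=i \mid I^k\}, \qquad k = 0, \ldots, N-2
\end{aligned}
\tag{4.19}
\]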

where the cost matrices follow from (4.13) and (4.9). Also note that the
probabilities $P\{M(k+2)=i \mid I^k\}$ are calculated using the elements of the Markov
transition matrix and the time k conditional model probabilities (see
Appendix).
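
The mode probabilities that enter the control law can be sketched as below. The Appendix derivation is not reproduced in this section, so the following is an assumption: it uses the standard two-step Chapman-Kolmogorov prediction and Bayes' rule for a Markov chain, with a hypothetical function name and 0-based indices.

```python
import numpy as np

def two_step_mode_prediction(p, mu):
    """Given p[i, j] = P{M(k+1)=j | M(k)=i} and mu[l] = P{M(k)=l | I^k},
    return
      mu1[j]     = P{M(k+1)=j | I^k},
      mu2[i]     = P{M(k+2)=i | I^k},
      back[j, i] = P{M(k+1)=j | M(k+2)=i, I^k}.
    """
    p = np.asarray(p, dtype=float)
    mu = np.asarray(mu, dtype=float)
    mu1 = mu @ p                        # one-step prediction
    mu2 = mu1 @ p                       # two-step prediction
    back = (p * mu1[:, None]) / mu2     # Bayes' rule, column i normalized by mu2[i]
    return mu1, mu2, back
```

For the transition matrix (5.5) and equal time-k model probabilities, this gives one-step probabilities (0.45, 0.55) and two-step probabilities (0.415, 0.585).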




Summarizing,  the  controller  is  given  by  (4.19)  for  all  but  the  last  period, 
for  which  it  is  given  by  (3.9). 

We  note  that  the  resulting  algorithm  is  real-time  implementable:  its 
complexity  is  linear  in  the  number  of  models  compared  to  a  standard  LQ 
controller. We also note that there is a similarity between the form of (4.19)
and that of the optimal linear quadratic control, and that $\hat{x}^{0j}$ is
obtained directly from the IMM estimator.

Note  that  the  r  Riccati  equations  (4.13)  are  coupled  via  (4.9)  -  which 
is  a  mixing  or  interaction  of  the  results  from  the  previous  iteration.  These 
equations  can  be  implemented  naturally  with  r  parallel  processors  that 
interact  via  (4.9)  after  each  iteration.  This  and  the  fact  that  the  IMM  is 
also  parallelizable  in  the  same  manner  motivates  the  name  PCA. 
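
The coupled backward recursion described above can be sketched as follows. This is an illustrative sketch only: `pca_riccati` is a hypothetical name, model indices are 0-based, and the time index is shifted by one relative to (4.13) so that `P[k][i]` is computed from `S_i(k+1)`.

```python
import numpy as np

def pca_riccati(F, G, Q, R, p, N):
    """Backward recursion for the r model-conditioned Riccati matrices.

    Q[k], R[k] are the cost matrices of (2.1); p[i, j] is the Markov
    transition probability. Returns a dict P with P[k][i] for k = 1..N.
    """
    r = len(F)
    p = np.asarray(p, dtype=float)
    P = {N: [Q[N].copy() for _ in range(r)]}          # terminal condition (4.14)
    for k in range(N - 1, 0, -1):
        # Interaction step (4.9): mix the matrices of the previous iteration
        S = [sum(p[i, j] * P[k + 1][j] for j in range(r)) for i in range(r)]
        P[k] = []
        for i in range(r):                            # independent per model
            GS = G[i].T @ S[i]
            K = np.linalg.solve(R[k] + GS @ G[i], GS)
            P[k].append(Q[k] + F[i].T @ (S[i] - GS.T @ K) @ F[i])
    return P
```

Within one iteration the inner loop over i has no data dependence between models, so each Riccati update could run on its own processor; only the mixing step needs all r results. With a single model the recursion reduces to the standard discrete-time LQ Riccati equation.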

Finally  we  note  that  if  we  were  to  take  into  account  the  dual  effect  the 
complexity  of  the  algorithms  would  have  precluded  real-time  implementability. 
5. Simulation Results

The algorithm developed in Sec. 4 is used to control the state trajectory
of a system that can jump between two models. The performance of this
algorithm, given by (2.1), is compared to the cost obtainable by using the MMP
controller discussed in [T1] and the FT scheme derived in [C2]. The MMP and FT
schemes both take into account the entire "tree" of future model sequences.
However these algorithms differ significantly in their derivations. The MMP
control is computed as a probabilistically weighted sum, over all possible
model sequences, of the "model sequence" optimal controls. In the FT scheme
the expected optimal cost-to-go is approximated as a probabilistically weighted
sum, over all possible model sequences, of the "model sequence" expected
optimal costs-to-go. In order to obtain a meaningful comparison we use the
rigorous statistical analysis technique presented in [B4, W2].

The  control  of  a  double  integrator  system  with  process  and  measurement 
noises  is  considered  with  a  gain  failure.  The  two  possible  models  are  given  by 
the following system equation

x'(k+l)  =  [  J  i  ]  x‘(k)  *  £  J  u(k)  +  [T2t/Z]  v(k)  M’2  l5A) 

with measurement equation

\[ z(k) = [\,1 \;\; 0\,]\, x^i(k) + w(k) \tag{5.2} \]

The models differ in the control gain parameter $b_i$. The process and
measurement noises are mutually uncorrelated with zero mean and variances given
by

\[ E[v(k)v(j)] = 0.16\, \delta_{kj} \tag{5.3} \]

and

\[ E[w(k)w(j)] = \delta_{kj} \tag{5.4} \]

The control gain parameters were chosen to be $b_1 = 2$ and $b_2 = 0.5$.

The Markov transition matrix was selected to be

\[ \begin{bmatrix} 0.8 & 0.2 \\ 0.1 & 0.9 \end{bmatrix} \tag{5.5} \]

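For this example N=7, and the cost parameters R(k) and Q(k) (see (2.1)) were
selected as

\[ R(k) = 5.0, \qquad k = 1, 2, \ldots, N-1 \tag{5.6} \]

and

\[
\begin{aligned}
Q(0) &= \begin{bmatrix} 0.0 & 0.0 \\ 0.0 & 0.0 \end{bmatrix}, &
Q(1) &= \begin{bmatrix} 2.0 & 0.0 \\ 0.0 & 2.0 \end{bmatrix}, &
Q(2) &= \begin{bmatrix} 3.0 & 0.0 \\ 0.0 & 3.0 \end{bmatrix}, &
Q(3) &= \begin{bmatrix} 4.0 & 0.0 \\ 0.0 & 4.0 \end{bmatrix}, \\
Q(4) &= \begin{bmatrix} 5.0 & 0.0 \\ 0.0 & 5.0 \end{bmatrix}, &
Q(5) &= \text{[illegible]}, &
Q(6) &= \begin{bmatrix} 5.0 & 0.0 \\ 0.0 & 5.0 \end{bmatrix}, &
Q(7) &= \begin{bmatrix} 20.0 & 0.0 \\ 0.0 & 8.0 \end{bmatrix}
\end{aligned}
\tag{5.7}
\]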


where the last matrix, Q(7), reflects our desire to drive $x_1(7)$ vigorously to
zero. The sampling period for this example was T=1.0.

The real system was initialized with x(0)=[30.0, 0.0]' and a random
selection was done for choosing the initial model with P{M(0)=i}=0.5, i=1,2.
The Kalman filters each received an initial state covariance of


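\[ P(0|0) = \begin{bmatrix} 1.0 & 1.0 \\ 1.0 & 2.0 \end{bmatrix} \tag{5.8} \]

and the initial state estimate was based on initial noisy measurements

\[ \begin{bmatrix} \hat{x}_1(0|0) \\ \hat{x}_2(0|0) \end{bmatrix} = \begin{bmatrix} z(0) \\ z(0) - z(-1) \end{bmatrix} \tag{5.9} \]

where z(-1) = 30.0 + w(-1) and z(0) = 30.0 + w(0).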

Statistical  tests  were  made  on  the  results  of  50  Monte  Carlo  runs.  Sample 
means  and  variances  of  the  cost  defined  in  (2.1)  were  computed  for  the  MMP,  FT, 
PCA, and "known model-history" (i.e., the unrealizable optimum
linear-quadratic) controllers.

Table  I  contains  the  results.  The  FT  and  PCA  algorithms  show  a  clear 
reduction  in  cost  as  compared  with  the  MMP  scheme.  However  in  order  to  provide 
a  rigorous  argument  that  the  actual  performance  is  ordered  as  Table  I  indicates 
we apply the statistical test presented in [B4, W2].

Table II contains the results. The sample mean $\bar{\Delta}$ and the standard
deviation $\sigma_{\bar{\Delta}}$ of the means of the cost differences, with
realizations $C_i^{PCA} - C_i^{FT}$, $C_i^{MMP} - C_i^{FT}$, and $C_i^{MMP} - C_i^{PCA}$,
for the i-th run of each simulation, are shown. The hypothesis that the FT
controller is better than the MMP or PCA schemes can be accepted only if the
probability of error $\alpha$ is less than, say, 1 percent. Then the threshold
against which we compare the test statistic $\bar{\Delta}/\sigma_{\bar{\Delta}}$ is
$\mu = 2.33$. This test statistic has to exceed the threshold in order to accept
the hypothesis.

The results given in Table II indicate that the FT and PCA controllers
perform significantly better than the MMP controller for this problem. The
estimated improvements (decrease in costs) of 70% and 69%, respectively, are
statistically significant. However the hypothesis that the FT controller is
better than the PCA controller, when using $\mu = 2.33$ or $\mu = 1.65$ ($\alpha = 5\%$),
cannot be accepted. The estimated improvement of 1% is not statistically
significant and their performances are, thus, practically the same.
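
The comparison test described above can be sketched in a few lines. The details of the procedure in [B4, W2] are not given in this section, so this is an assumption-laden sketch: it uses the paired per-run cost differences, the unbiased sample variance, and a one-sided threshold; the function name is hypothetical.

```python
import math

def paired_cost_test(cost_a, cost_b, threshold=2.33):
    """Test the hypothesis that controller B is better (cheaper) than A.

    cost_a[i], cost_b[i] are the costs from the i-th Monte Carlo run.
    Returns (mean difference, std. dev. of the mean, test statistic, accept).
    """
    diffs = [a - b for a, b in zip(cost_a, cost_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)   # unbiased sample variance
    sigma_mean = math.sqrt(var / n)                        # std. dev. of the sample mean
    stat = mean / sigma_mean
    return mean, sigma_mean, stat, stat > threshold
```

With the Table II entries, 13,456/3,316 is about 4.06 and exceeds 2.33, whereas 78/82 is about 0.95 and does not exceed even 1.65.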


TABLE I

SAMPLE AVERAGE COSTS AND STANDARD DEVIATIONS

Controller                Sample Mean    Sample Standard Deviation

MMP                       19,519         23,4*16
FT                         6,063          4,404
PCA                        6,141          4,236
"Known Model-History"      2,647            630

TABLE II

STATISTICAL TEST FOR ALGORITHM COMPARISONS

                          Statistic                                        Estimated
Comparison    $\bar\Delta$    $\sigma_{\bar\Delta}$    $\bar\Delta/\sigma_{\bar\Delta}$    Cost Reduction (%)

PCA-FT             78               82                  .95                  1
MMP-FT         13,456            3,316                  4.1                 70
MMP-PCA        13,378            3,298                  4.1                 69

6. Conclusion

The development of a new control algorithm for discrete-time hybrid
stochastic systems with Markovian jump parameters has been presented. This
scheme has off-line computable control gain parameters, and is implementable on
parallel processors in a natural way. This controller is based on a fixed
number r (the number of models) of coupled Riccati equations and is suitable
for control problems with finite end-times. This scheme uses the IMM state
estimation algorithm. We showed that there is a natural connection between the
IMM state estimator and the control of jump-linear hybrid systems.

From the example it is seen that this scheme can achieve a statistically
significant reduction in cost when compared to the scheme of [T1]. We also
showed that the present controller and the controller of [C2] have
statistically indistinguishable costs. However, our new controller has a fixed
amount of computation at each step of the dynamic programming recursion,
whereas the schemes of [C2] and [T1] have an exponentially growing number of
computations. Thus our new controller is seen to compare favorably to both the
[T1] and [C2] schemes.
Defining

$\mu_{j|i}(k+1|k) \triangleq P\{M(k+1)=j \mid M(k+2)=i, I^k\}$   (A.3)

and noting that

$E\{x(k) \mid I^k, M(k+2)=i, M(k+1)=j\} = E\{x(k) \mid I^k, M(k+1)=j\} = \hat{x}^{0j}(k|k)$   (A.4)

[see (3.8)], and taking a gradient w.r.t. u(k) of (A.2) and setting to zero
yields $u^{PCA}(k)$ in (4.19).
II. Calculation of $P\{M(k+2)=i \mid I^k\}$

Using Bayes' formula one can write

$P\{M(k+2)=i \mid I^k\} = \sum_{j=1}^{r} P\{M(k+2)=i \mid M(k+1)=j, I^k\}\, P\{M(k+1)=j \mid I^k\}$

$= \sum_{j=1}^{r} p_{ji}\, P\{M(k+1)=j \mid I^k\} = \sum_{j=1}^{r} p_{ji} \sum_{l=1}^{r} p_{lj}\, P\{M(k)=l \mid I^k\}$   (A.5)

where the conditional model probabilities $P\{M(k)=l \mid I^k\}$, l = 1, ..., r, are
obtained from the IMM estimator.
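
The double sum in (A.5) amounts to applying the transposed Markov transition
matrix twice to the current model probabilities. A minimal Python sketch of
this, with hypothetical function and argument names and with 0-based indices
in place of the 1-based indices of the text, could read:

```python
def predict_model_probabilities(p, mu_k, steps=2):
    """Propagate model probabilities through a Markov chain.

    p[j][i] is the transition probability from model j to model i
    (p_ji in the text) and mu_k[l] is P{M(k)=l | I^k}, assumed to come
    from the IMM estimator. Returns P{M(k+steps)=i | I^k}.
    """
    mu = list(mu_k)
    r = len(mu)
    for _ in range(steps):
        # mu_next[i] = sum over j of p[j][i] * mu[j]
        mu = [sum(p[j][i] * mu[j] for j in range(r)) for i in range(r)]
    return mu
```

Calling the function with `steps=2` gives the two-step-ahead probabilities of
(A.5); `steps=1` gives the intermediate one-step prediction.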
Discrete Time Point Process Filter for Image Based Target Mode Estimation

Abstract

The performance of tracking and prediction systems of a maneuvering target
can be improved by using additional (and unconventional) measurements of its
apparent modes, typically provided by an imaging sensor. A model for the
image-based observation channel for target mode estimation in discrete time is
presented in this paper. A multidimensional point process filter is obtained
by making use of the discrete time point process theory and its utilization is
illustrated through simulation examples.
Abstract: We consider the Jump Linear Quadratic Problem where linear
state dynamics are made contingent upon the Markovian transition of
a regime variable. It is desired to regulate the state while minimizing a
quadratic performance index. In the case of partial observations the exact
solution has proved to be elusive and, in this paper, we present a new
approximation based on the optimal solution of an averaged version of the
original problem.
1 Introduction


There  are  many  applications  where  the  state  dynamics  are  disturbed  by 
random  point  processes  :  in  fault-tolerant  process  control  for  example  the 
input-output  plant  model  is  contingent  upon  the  indicator  of  the  failure 
regime.  Similarly  a  tracking  system  for  maneuvering  targets  needs  to  be 
reconfigured  when  the  mode  of  the  encounter  switches  from  level  flight  to 
an  evasive  maneuver. 

In this paper we study a control problem for the class of systems, often
called "hybrid systems", that has been proposed in the literature to describe
such applications. Hybrid systems are characterized by their product state
space. To the usual Euclidean space ($\mathbb{R}^n$) we append a finite set
($S = \{1, 2, \ldots, N\}$): on $\mathbb{R}^n$ we can model the basic dynamics
(e.g. position and velocity of the target for a tracking system) and we use S
as the list of possible regimes of operation (e.g. with N = 2 the maneuver/no
maneuver mode of a tracked target). The regime jumps on S are modeled by a
Markov chain and the dynamics on $\mathbb{R}^n$ obey a differential equation,
the coefficients of which are contingent upon the regime.

The study of hybrid systems can be traced back to the early sixties
(Florentin, 1961, Krasovskii and Lidskii, 1961). The Jump Linear Quadratic
problem was introduced by (Sworder, 1969): the state dynamics being linear,
an optimal regulation problem is posed with respect to a quadratic
performance index. For completely observed state and regime, the optimal
JLQ regulator has been obtained independently by (Sworder, 1969)
and (Wonham, 1971) from maximum principle and dynamic programming
points of view. Motivated by various applications, significant research
efforts have been devoted to hybrid systems theory: the structure of linear
hybrid systems has been analyzed in (Chizeck, 1982, Mariton, 1988),
refined models have been considered in (Sworder 1973, 1980, 1982) and the
theory has also been extended to the discrete time setting (Griffiths and
Loparo, 1985, Chizeck et al., 1986, Ezzine and Haddad, 1988). Surveys of
available results are given in (Sworder, 1976, Mariton, 1989).

More  recently  attention  has  been  focused  on  the  JLQ  problem  with 
partial  observations,  i.e.  the  case  where  the  state  and/or  the  regime  is  only 
partially  measured  through  noisy  sensors.  The  most  general  setting  is  that 
of (Caines and Chen, 1985) and several approximations have been proposed
(Sworder and Chou, 1985, Fragoso, 1988, Mariton, 1988).

A  major  difficulty  is  the  dual  control  phenomenon  :  due  to  the  couplings 
between  the  state  and  the  regime,  the  optimal  control  has  to  balance  its 
proper  regulation  objective  with  the  need  to  excite  the  plant  to  gain  more 
information  on  the  unobserved  regime  of  operation.  In  this  paper  we  present 
a  new  approximation  scheme  based  on  an  averaging  of  the  exact  system 
dynamics. 

The  rest  of  the  paper  is  organized  as  follows.  In  section  2  we  formulate 
the  JLQ  problem  with  partial  observations  and  the  main  result  is  presented 
in  section  3  in  the  form  of  the  optimal  solution  of  an  averaged  JLQ  problem. 
Finally  an  example  illustrates  the  solution  obtained. 


2  Problem  formulation 


We consider the JLQ problem as formulated in e.g. (Sworder 1969, Caines
and Chen, 1985). The plant state $x_t \in \mathbb{R}^n$ obeys

$dx_t = A(r_t)x_t\,dt + B(r_t)u_t\,dt + D(r_t)\,dw_t$   (1)

where $u_t \in \mathbb{R}^m$ is the control vector, $w_t$ a normalized Brownian
motion, and A, B, D matrices of corresponding dimensions. These matrices depend
on the current regime of operation, $r_t \in S = \{1, 2, \ldots, N\}$, and we
shall often use an index to denote the regime, e.g. $A_i$ for $A(r_t)$ when
$r_t = i$. The regime jumps are described by a Markov chain


$d\phi_t = \Pi'\phi_t\,dt + dm_t$   (2)

where $\phi_t$ is the regime indicator ($\phi_t \in \{0,1\}^N$, $\phi_{ti} = 1$
when $r_t = i$, 0 otherwise), $m_t$ is a martingale w.r.t. the underlying system
$\sigma$-algebra and $\Pi = (\pi_{ij})_{i,j=1,\ldots,N}$ is the matrix of
transition rates. The role of $\Pi$ is better understood by observing that (2)
implies

$P\{r_{t+dt} = j \mid r_t = i\} = \begin{cases} \pi_{ij}\,dt + o(dt), & i \neq j \\ 1 + \pi_{ii}\,dt + o(dt), & i = j \end{cases}$
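
To make the role of the transition rates concrete, the sketch below builds the
first-order transition probabilities over a small step dt and uses them to draw
a sample path of the regime. It is only an illustration under stated
assumptions: the names are hypothetical, regimes are indexed from 0 rather than
1, the o(dt) terms are dropped, and dt is assumed small enough that every
probability stays nonnegative.

```python
import random


def step_transition_matrix(rates, dt):
    """First-order approximation of the transition probabilities over dt:
    rates[i][j] * dt off the diagonal and 1 + rates[i][i] * dt on it."""
    n = len(rates)
    probs = [[(1.0 if i == j else 0.0) + rates[i][j] * dt for j in range(n)]
             for i in range(n)]
    if any(p < 0.0 for row in probs for p in row):
        raise ValueError("dt is too large for the first-order approximation")
    return probs


def simulate_regime(rates, r0, dt, n_steps, rng=None):
    """Draw a regime path r_0, r_dt, r_2dt, ... from the approximate chain."""
    rng = rng or random.Random(0)
    probs = step_transition_matrix(rates, dt)
    path = [r0]
    for _ in range(n_steps):
        current = path[-1]
        path.append(rng.choices(range(len(rates)), weights=probs[current])[0])
    return path
```

Each row of the rate matrix is assumed to sum to zero, so that each row of the
resulting probability matrix sums to one.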


The $\pi_{ij}$ of $\Pi$ are thus the transition rates of the regime process. We
shall assume that $x_t$ is exactly observed but that the regime $r_t$ is measured
through a noisy scalar channel

$dy_t = h(r_t)\,dt + dv_t$   (3)

where $v_t$ is a Brownian motion with intensity $\sigma^2$. It is assumed that v, w
and m are mutually independent. The observation $\sigma$-algebra available at
the regulator is thus $\mathcal{X}_t \vee \mathcal{Y}_t$ where
$\mathcal{X}_t = \sigma\{x_s, s \le t\}$ and $\mathcal{Y}_t = \sigma\{y_s, s \le t\}$.
The best regime estimate in the sense of minimal mean square error
would then be $E\{\phi_t \mid \mathcal{X}_t \vee \mathcal{Y}_t\}$. However the
control $u_t$ influences, through (1), the information richness of
$\mathcal{X}_t$ and it is clear that using
$E\{\phi_t \mid \mathcal{X}_t \vee \mathcal{Y}_t\}$ would lead to a dual control
problem. We shall thus assume that the direct measurement (3) is of high quality
so that we can take $\hat\phi_t = E\{\phi_t \mid \mathcal{Y}_t\}$
without a significant degradation of the regime estimate.

This assumption can be understood in terms of the target tracking
application: when the target maneuvers it takes some time until the position
and velocity (= components of $x_t$) reflect the change in acceleration while,
on the contrary, a direct sensor like (3) (maybe the output of an imaging
device) can immediately signal a transition (maybe through a sudden change
of the apparent dimensions of the target). It is then interesting, especially
during the short transients where fast reconfiguration is required, to
disregard the slow information channel of $\mathcal{X}_t$ and to base the regime
estimate solely on $\mathcal{Y}_t$. Obviously there are situations where this
assumption is not acceptable and, by enforcing it, we restrict attention to a
special class of hybrid systems.

The practical control objective is to stabilize $x_t$ near zero without
spending too much control and this can be achieved by minimizing the
performance index

$J = E\{\int_{t_0}^{t_f} (x_t'Qx_t + u_t'Ru_t)\,dt \mid x_{t_0} = x_0, \phi_{t_0} = \phi_0\}$   (4)

The weighting matrices $Q$ ($\ge 0$) and $R$ ($> 0$) are regime dependent
($Q = Q(r_t)$, $R = R(r_t)$). Technical conditions are necessary to ensure that
minimizing (4) indeed stabilizes the system. In the JLQ setting some care
is required in defining the most accurate notions of stabilizability
(controllability) and detectability (observability). This was pursued in (Ji and
Chizeck, 1988, Mariton, 1986) but here we shall take the simpler condition
that the pairs $[A_i, B_i]$ and $[A_i, Q_i^{1/2}]$ are, respectively,
controllable and observable in the usual deterministic sense. The class of
admissible control policies $\mathcal{U}$ is the class of feedback laws
$u_t = U(x_s, y_s, s \le t)$ where U satisfies the usual smoothness conditions
(Wonham, 1971). The problem to be solved is thus

$\min_{U \in \mathcal{U}} J$   (5)

The partially observed problem is transformed into a completely observed
one in the augmented state $(x_t', \hat\phi_t')'$, where it is recalled that we
use the approximation
$E\{\phi_t \mid \mathcal{Y}_t\} \approx E\{\phi_t \mid \mathcal{X}_t \vee \mathcal{Y}_t\}$
for $\hat\phi_t$.

3  The  averaging  approximation 

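For the regime dynamic equation (2) and the measurement channel (3),
the estimate $\hat\phi_t = E\{\phi_t \mid \mathcal{Y}_t\}$ is given by the
following stochastic differential equation (Wonham, 1965, Wong and Hajek, 1985):

$d\hat\phi_t = \Pi'\hat\phi_t\,dt + G(\hat\phi_t)\,d\nu_t$   (6)

where the innovation process is $d\nu_t = dy_t - H'\hat\phi_t\,dt$ with
$H = (h_1, \ldots, h_N)'$. The filter gain $G(\hat\phi_t)$ is

$G(\hat\phi_t) = \sigma^{-2}\,(\mathrm{diag}(\hat\phi_t) - \hat\phi_t\hat\phi_t')\,H$

with $\sigma^2$ the intensity of $v_t$ in (3).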
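
For illustration only, one Euler step of a discretized version of the filter
(6) could be coded as below. This is a sketch under stated assumptions, not the
authors' implementation: the name `sigma2` for the intensity of the measurement
noise, the Euler discretization, and the final clipping and renormalization are
all choices made here, and regimes are indexed from 0.

```python
def wonham_step(phi_hat, rates, h, sigma2, dy, dt):
    """One Euler step of the regime filter.

    phi_hat : current regime probabilities (sums to one)
    rates   : matrix of transition rates, rates[j][i] from regime j to i
    h       : regime-dependent observation levels h_i
    sigma2  : assumed intensity of the measurement noise
    dy      : observed increment of y over the step
    dt      : step length
    """
    n = len(phi_hat)
    h_bar = sum(h[i] * phi_hat[i] for i in range(n))      # H' phi_hat
    innovation = dy - h_bar * dt
    updated = []
    for i in range(n):
        drift = sum(rates[j][i] * phi_hat[j] for j in range(n))  # (Pi' phi_hat)_i
        # i-th entry of (diag(phi_hat) - phi_hat phi_hat') H / sigma2
        gain = phi_hat[i] * (h[i] - h_bar) / sigma2
        updated.append(phi_hat[i] + drift * dt + gain * innovation)
    # An Euler step can leave the probability simplex; clip and renormalize.
    clipped = [max(p, 0.0) for p in updated]
    total = sum(clipped)
    return [p / total for p in clipped]
```

Because the rows of the rate matrix sum to zero and the entries of the gain sum
to zero, the unclipped update preserves the total probability; the
renormalization only matters when a component would become negative.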

As explained above, we do not pursue the exact solution of (5) but rather
look for an approximate solution with a strong practical appeal. We
transform the problem into a completely observed one as follows.

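First the dynamics are averaged into

$dx_t = \bar A_t x_t\,dt + \bar B_t u_t\,dt + \bar D_t\,dw_t$   (7)

with

$\bar A_t = E\{A(r_t) \mid \mathcal{Y}_t\} = \sum_{i=1}^{N} \hat\phi_{ti} A_i$

$\bar B_t = E\{B(r_t) \mid \mathcal{Y}_t\} = \sum_{i=1}^{N} \hat\phi_{ti} B_i$

$\bar D_t = E\{D(r_t) \mid \mathcal{Y}_t\} = \sum_{i=1}^{N} \hat\phi_{ti} D_i$

Similarly the cost matrices are replaced by

$\bar Q_t = E\{Q(r_t) \mid \mathcal{Y}_t\} = \sum_{i=1}^{N} \hat\phi_{ti} Q_i$

$\bar R_t = E\{R(r_t) \mid \mathcal{Y}_t\} = \sum_{i=1}^{N} \hat\phi_{ti} R_i$

This averaging provides an acceptable approximation of the original
problem when the regime estimate based on (3) is a good regime indicator.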
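
The averaging step is a probability-weighted sum of the regime-dependent
matrices. A small Python sketch, using nested lists for matrices and a
hypothetical function name, might look as follows; the same routine would apply
unchanged to the A, B, D, Q and R families.

```python
def average_matrices(phi_hat, matrices):
    """Return sum_i phi_hat[i] * matrices[i], where every matrices[i] is a
    list of rows with identical dimensions and phi_hat holds the regime
    probabilities supplied by the regime filter."""
    n_rows = len(matrices[0])
    n_cols = len(matrices[0][0])
    return [[sum(p * m[r][c] for p, m in zip(phi_hat, matrices))
             for c in range(n_cols)]
            for r in range(n_rows)]
```

For example, with probabilities 0.25 and 0.75 on two regimes, each entry of the
averaged matrix lies between the corresponding entries of the two regime
matrices, three quarters of the way toward the second.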

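For the averaged system (7) we have a completely observed optimization
problem that we solve through dynamic programming (Wonham, 1971).
The cost-to-go is defined as

$V(t, x, \hat\phi) = E\{\int_t^{t_f} (x_s'\bar Q_s x_s + u_s'\bar R_s u_s)\,ds \mid x_t = x, \hat\phi_t = \hat\phi\}$   (8)

The infinitesimal generator is

$\mathcal{L}V(t, x_t = x, \hat\phi_t = \hat\phi) = \frac{\partial V}{\partial t}(t,x,\hat\phi) + \frac{\partial V}{\partial x}(t,x,\hat\phi)'(\bar A_t x + \bar B_t u) + \frac{\partial V}{\partial \hat\phi}(t,x,\hat\phi)'\,\Pi'\hat\phi + \frac{\sigma^2}{2}\,\mathrm{tr}\Big(G'\frac{\partial^2 V}{\partial \hat\phi^2}G\Big) + \frac{1}{2}\,\mathrm{tr}\Big(\bar D_t'\frac{\partial^2 V}{\partial x^2}\bar D_t\Big)$   (9)

where $\sigma^2$ is the intensity of $v_t$ in (3).

We then have the following result:

Theorem:

The solution of the optimization problem for the averaged model is

$u_t^* = -\bar R_t^{-1}\bar B_t'\Lambda x_t$   (10)

where the n x n matrix $\Lambda$ satisfies a Cauchy equation

$-\frac{\partial \Lambda}{\partial t} = \bar A_t'\Lambda + \Lambda\bar A_t - \Lambda\bar B_t\bar R_t^{-1}\bar B_t'\Lambda + \bar Q_t + \sum_{i=1}^{N} \frac{\partial \Lambda}{\partial \hat\phi_{ti}}(\Pi'\hat\phi_t)_i + \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} \frac{\partial^2 \Lambda}{\partial \hat\phi_{ti}\,\partial \hat\phi_{tj}}\,G_iG_j\,\sigma^2$   (11)

with $\Lambda(t_f, \hat\phi_{t_f}) = 0$.

The corresponding optimal performance is

$J^* = x_{t_0}'\Lambda(t_0, \hat\phi_{t_0})x_{t_0} + \mu(t_0, \hat\phi_{t_0})$   (12)

where the scalar $\mu$ satisfies

$-\frac{\partial \mu}{\partial t} = \mathrm{tr}(\bar D_t'\Lambda\bar D_t) + \sum_{i=1}^{N} \frac{\partial \mu}{\partial \hat\phi_{ti}}(\Pi'\hat\phi_t)_i + \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} \frac{\partial^2 \mu}{\partial \hat\phi_{ti}\,\partial \hat\phi_{tj}}\,G_iG_j\,\sigma^2$   (13)

with $\mu(t_f, \hat\phi_{t_f}) = 0$.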


Proof :  see  the  appendix. 


This  result  calls  for  the  following  remarks. 


Remark 1:

The obtained control feeds back both the measured state $x_t$ and the
regime estimate $\hat\phi_t$, thus providing a continuous adaptation to regime
variations. Under general conditions on the continuity of the coefficients of (6),
(7) the Cauchy equation (11) has a nonnegative solution that is uniformly
bounded over $[t_0, t_f] \times \mathbb{R}^N$ (Fleming and Rishel, 1975).

Remark  2  : 

A  similar  averaging  approximation  was  reported  in  (Lee  et  al.,  1985)  for 
the  case  where  the  random  influence  on  the  model  parameters  is  described 
by  a  Brownian  motion  rather  than  our  Markov  chain. 

Remark 3:

In previous studies (Sworder and Chou, 1985, Mariton, 1988) an
approximate solution was derived for the exact original optimal control
problem. On the contrary the above theorem provides the exact solution to
a modification of the original problem obtained through an averaging
approximation. Still another approach is possible whereby the solution with
a measured regime (Sworder, 1969) is averaged with weights given by the
a posteriori regime probabilities. It was shown in (Fragoso, 1988) that
this provides the optimal solution to a modified problem with a different
quadratic cost.
4 Example

To illustrate the results of § 3 we consider a scalar system with two regimes

$dx_t = a(r_t)x_t\,dt + b(r_t)u_t\,dt + d(r_t)\,dw_t$

and we denote by $a_i$, $b_i$ and $d_i$ the values of a, b and d when $r_t = i$
(i = 1, 2). The transition matrix of the regime Markov chain is

$\Pi = \begin{pmatrix} \pi_{11} & \pi_{12} \\ \pi_{21} & \pi_{22} \end{pmatrix}$

The averaged parameters are written

$\bar a = (a_1 - a_2)\hat\phi_{t1} + a_2$

$\bar b = (b_1 - b_2)\hat\phi_{t1} + b_2$

$\bar d = (d_1 - d_2)\hat\phi_{t1} + d_2$

with $\hat\phi_{t1} = P\{r_t = 1 \mid \mathcal{Y}_t\}$.

From the above theorem we then obtain

$u_t = -\frac{(b_1 - b_2)\hat\phi_{t1} + b_2}{(r_1 - r_2)\hat\phi_{t1} + r_2}\,\Lambda(\hat\phi_{t1})\,x_t$   (14)

where $\Lambda(\hat\phi_{t1})$ is solution of the Cauchy equation

$-\frac{\partial \Lambda}{\partial t} = (q_1 - q_2)\hat\phi_{t1} + q_2 + 2[(a_1 - a_2)\hat\phi_{t1} + a_2]\Lambda - \frac{[(b_1 - b_2)\hat\phi_{t1} + b_2]^2}{(r_1 - r_2)\hat\phi_{t1} + r_2}\,\Lambda^2 + [\pi_{11}\hat\phi_{t1} + \pi_{21}(1 - \hat\phi_{t1})]\frac{\partial \Lambda}{\partial \hat\phi_{t1}} + \frac{1}{2}\hat\phi_{t1}^2(1 - \hat\phi_{t1})^2\frac{(h_1 - h_2)^2}{\sigma^2}\frac{\partial^2 \Lambda}{\partial \hat\phi_{t1}^2}$   (15)

with $\Lambda(t_f, \hat\phi_{t_f 1}) = 0$, $\sigma^2$ being the intensity of
$v_t$ in (3).

The possibility of regime transitions is reflected in the dependence of $\Lambda$
on $\hat\phi_{t1}$. The implementation of (14), (15) requires the on-line
solution of a two point boundary value problem: (15) is integrated backward in
time with coefficients depending on regime estimates.
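
As a rough illustration of the scalar example, the sketch below evaluates the
averaged parameters, the feedback control, and the right-hand side of the
Cauchy equation for given values of $\Lambda$ and its derivatives with respect
to the regime probability. It does not solve the boundary value problem itself.
All names are hypothetical, `sigma2` stands for the assumed intensity of the
measurement noise, and the second-order coefficient follows the scalar form of
the filter gain, which is an assumption of this sketch.

```python
def averaged(p1, v1, v2):
    """Scalar parameter averaged with probability p1 on regime 1:
    (v1 - v2) * p1 + v2."""
    return (v1 - v2) * p1 + v2


def control(p1, lam, x, b, r):
    """Feedback control -(b_bar / r_bar) * lam * x, where b and r are
    pairs of regime values and lam is the current value of Lambda."""
    return -(averaged(p1, *b) / averaged(p1, *r)) * lam * x


def cauchy_rhs(p1, lam, dlam, d2lam, a, b, q, r, pi11, pi21, h, sigma2):
    """Right-hand side of the scalar Cauchy equation, i.e. the value of
    -dLambda/dt, given Lambda (lam) and its first and second derivatives
    with respect to the regime probability (dlam, d2lam)."""
    a_bar, b_bar = averaged(p1, *a), averaged(p1, *b)
    q_bar, r_bar = averaged(p1, *q), averaged(p1, *r)
    riccati = q_bar + 2.0 * a_bar * lam - (b_bar ** 2 / r_bar) * lam ** 2
    drift = (pi11 * p1 + pi21 * (1.0 - p1)) * dlam
    diffusion = 0.5 * p1 ** 2 * (1.0 - p1) ** 2 * (h[0] - h[1]) ** 2 / sigma2 * d2lam
    return riccati + drift + diffusion
```

A backward integration would call `cauchy_rhs` on a grid of probability values,
with the derivatives approximated numerically; that part is left out here.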
5 Conclusion


We considered an optimal control problem for a Markovian jump linear
system. A new approximation was proposed that averages the regime
dependent parameters using a high quality regime estimate.
This transformed the original partially observed problem into a completely
observed one.

Future  work  will  analyze  the  stability  of  the  original  system  under  the 
proposed  control  law  and  it  will  be  interesting  to  analyze  the  error  between 
the  true  system  and  the  averaged  one. 
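7 Appendix

We take a quadratic parametrization of the cost-to-go

$V(t, x_t, \hat\phi_t) = x_t'\Lambda(t, \hat\phi_t)x_t + \mu(t, \hat\phi_t)$

The principle of optimality then gives

$\min_{u_t}\{\mathcal{L}V(t, x_t, \hat\phi_t) + x_t'\bar Q(\hat\phi_t)x_t + u_t'\bar R(\hat\phi_t)u_t\} = 0$   (16)

The partial derivative of the above expression w.r.t. u is

$\bar B(\hat\phi_t)'\frac{\partial V}{\partial x_t}(t, x_t, \hat\phi_t) + 2\bar R(\hat\phi_t)u_t$

so that

$u_t^* = -\frac{1}{2}\bar R(\hat\phi_t)^{-1}\bar B(\hat\phi_t)'\frac{\partial V}{\partial x_t}(t, x_t, \hat\phi_t)$   (17)

Using $u_t^*$ in Bellman's equation we obtain an expression for $\Lambda$. First we
evaluate separately the following partial derivatives

$\Big(\frac{\partial}{\partial \hat\phi_t}\big(x_t'\Lambda(t, \hat\phi_t)x_t\big)\Big)'\,\Pi'\hat\phi_t = x_t'\Big[\sum_{i=1}^{N} \frac{\partial \Lambda}{\partial \hat\phi_{ti}}(\Pi'\hat\phi_t)_i\Big]x_t$

and

$\mathrm{tr}\Big(G'\,\frac{\partial^2}{\partial \hat\phi_t\,\partial \hat\phi_t'}\big(x_t'\Lambda(t, \hat\phi_t)x_t\big)\,G\Big) = x_t'\Big[\sum_{i=1}^{N}\sum_{j=1}^{N} G_iG_j\frac{\partial^2 \Lambda}{\partial \hat\phi_{ti}\,\partial \hat\phi_{tj}}\Big]x_t$

Grouping terms quadratic in $x_t$ and constant terms we finally obtain (11) and
(13) of the theorem. The optimal cost (12) is then deduced from the definition
of the cost-to-go at $t = t_0$.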